A Comparison between Common Item Equating with Pre- and Post-Reading and Listening Tests
Abstract: Our ultimate goal is to construct pre- and post-e-testing systems. As a first stage, we need to find the best method for equating pre- and post-tests in order to measure the growth of about 6,600 students over two years in a university language program. After conducting the evaluation procedure, we found that the 3PLcFix model could be the more appropriate model for the reading tests, while the 2PL model was the better fit for the listening tests. The average gains in reading and listening ability among the students who took the pre- and post-tests were calculated separately, applying the more appropriate model in each case. Results revealed that the average gain in reading ability was 0.3006 smaller than the average gain in listening ability. This could be due to larger equating errors in the reading tests, which were based on a passage whose ten common items could be locally dependent. Another possibility is that participants may have a general tendency to perform better or worse on passage-based items than on discrete items.
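For readers less familiar with the models named in the abstract, the following is a minimal sketch of the two item response functions and of an average-gain calculation. It assumes that "3PLcFix" denotes a three-parameter logistic model whose guessing parameter c is held fixed at one common value. That reading of the label, the value c = 0.2, the omission of the scaling constant D, and all function names are illustrative assumptions, not details taken from the study.

```python
import math

# Assumed common guessing parameter for the "c fixed" 3PL variant.
# The value 0.2 is purely illustrative and is not reported in the abstract.
FIXED_C = 0.2


def p_2pl(theta, a, b):
    """2PL model: probability of a correct response for ability theta,
    item discrimination a, and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_3pl_cfix(theta, a, b, c=FIXED_C):
    """3PL model with the lower asymptote c fixed across items
    (our assumed interpretation of 3PLcFix)."""
    return c + (1.0 - c) * p_2pl(theta, a, b)


def mean_gain(pre_thetas, post_thetas):
    """Average ability gain for students with both a pre- and a post-test
    estimate, assuming both sets are already on a common (equated) scale."""
    if len(pre_thetas) != len(post_thetas) or not pre_thetas:
        raise ValueError("need paired, non-empty ability estimates")
    return sum(post - pre for pre, post in zip(pre_thetas, post_thetas)) / len(pre_thetas)
```

Under this sketch, a difference such as the one reported between skills would correspond to `mean_gain` for listening minus `mean_gain` for reading, each computed on estimates from the model chosen for that skill.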