Systematic equating error with the randomly-equivalent groups design: An examination of the equal ability distribution assumption
Umeå University, Faculty of Social Sciences, Department of Educational Measurement.
University of Massachusetts.
(English) Manuscript (preprint) (Other academic)
Abstract [en]

The equal ability distribution assumption associated with the randomly-equivalent groups equating design was investigated in the context of a selection test for admission to higher education. Test-takers’ scores on anchor items from two subtests were estimated using information about test-taker performance on the regular subtests. The results indicated that anchor item performance varied enough to call the equal ability distribution assumption into question. Consequently, we conclude that more caution is needed when applying the randomly-equivalent groups design in the equating of tests. Assuming equal ability groups is convenient, but it can also lead to systematic bias in the equating of test scores, and this study provides a demonstration of that point.

Keyword [en]
equating error, randomly-equivalent groups design, anchor tests, college admission tests
URN: urn:nbn:se:umu:diva-25430
OAI: diva2:231762
Available from: 2009-08-17 Created: 2009-08-17 Last updated: 2010-01-14
In thesis
1. A perfect score: Validity arguments for college admission tests
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

College admission tests are of great importance for admissions systems in general and for candidates in particular. The SweSAT (Högskoleprovet in Swedish) has been used for college admission in Sweden for more than 30 years, and today it is, alongside the upper-secondary school GPA, the most widely used instrument for selection of college applicants. Because of the importance placed on the SweSAT, it is essential that the scores are reliable and that the interpretations and uses of the scores are valid. The main purpose of this thesis was therefore to examine some assumptions that are important for the validity of the interpretation and use of SweSAT scores. The argument-based approach to validation was used as the framework for the evaluation of these assumptions.
The thesis consists of four papers and an extensive introduction with summaries of the papers. The first three papers examine assumptions that are relevant for the use of SweSAT scores for admission decisions, while the fourth paper examines an assumption that is relevant for the use of SweSAT scores for providing diagnostic information. The first paper is a review of predictive validity studies that have been performed on the SweSAT. The general conclusion from the review is that the predictive validity of SweSAT scores varies greatly among study programs, and that there are many problematic issues related to the methodology of the predictive validity studies. The second paper focuses on an assumption underlying the current SweSAT equating design, namely that the groups taking different forms of the test have equal abilities. The results show that this assumption is highly problematic, and consequently a more appropriate equating design should be applied when equating SweSAT scores. The third paper examines the effect of textual item revisions on item statistics and preequating outcomes, using data from the SweSAT data sufficiency subtest. The results show that most kinds of revisions have a significant effect on both p-values and point-biserial correlations, and as a consequence the preequating outcomes are affected negatively. The fourth paper examines whether there is added value in reporting subtest scores, rather than just the total score, to the test-takers. Using a method derived from classical test theory, the results show that all observed subscores are better predictors of the true subscores than is the observed total score, with the exception of the Swedish reading comprehension subtest. That is, the subscores contain information that the test-takers can use for remedial studies, and hence there is added value in reporting the subscores. The general conclusion from the thesis as a whole is that the interpretations and uses of SweSAT scores are based on several questionable assumptions, but also that the interpretations and uses are supported by a great deal of validity evidence.

Place, publisher, year, edition, pages
Umeå: Institutionen för Beteendevetenskapliga mätningar, Umeå Universitet, 2009. 58 p.
Academic dissertations at the department of Educational Measurement, ISSN 1652-9650 ; 4
Keyword [en]
college admission tests, SweSAT, validity, interpretive arguments, predictive validity, equating, item revisions, subscores
urn:nbn:se:umu:diva-25433 (URN)
Beteendevetenskapliga mätningar, 90187, Umeå
Public defence
2009-09-25, Hörsal E, Humanisthuset, Umeå Universitet, Umeå, 13:15 (English)
Available from: 2009-09-04 Created: 2009-08-17 Last updated: 2009-09-04
Bibliographically approved
