The effect of item revisions on classical item statistics and preequating outcomes
Umeå University, Faculty of Social Sciences, Department of Educational Measurement.
(English) Manuscript (preprint) (Other academic)
Abstract [en]

The purpose of this study was to examine the effect of textual item revisions on classical item statistics and on the adequacy of preequating outcomes. Three forms of the SweSAT data sufficiency subtest, comprising a total of 66 items, were examined. The items were categorized by type of revision, and the differences in p-values and point-biserial correlations between the regular test and the pretest were averaged within each category. These averages were subjected to a t-test, and it was found that revisions have a significant effect on both difficulty and discrimination indices. Also, while the preequating method used in this study produced adequate results, the revisions seem to increase the amount of error in preequating outcomes.
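The classical item statistics named in the abstract can be illustrated with a minimal sketch. It is not the study's own code: the function names are hypothetical, the point-biserial is computed against the uncorrected total score, and the one-sample t statistic against zero is an assumption, since the abstract does not say which form of t-test was used.

```python
from statistics import mean, pstdev, stdev


def p_value(item_scores):
    """Classical difficulty index: proportion of correct (1) responses."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and the total test score.

    Uses r_pb = (M1 - M0) / s * sqrt(p * (1 - p)), where M1 and M0 are the
    mean totals of those who answered correctly and incorrectly, and s is
    the population standard deviation of the totals.
    """
    p = mean(item_scores)
    sd = pstdev(total_scores)
    if p in (0, 1) or sd == 0:
        raise ValueError("point-biserial is undefined without variation")
    pairs = list(zip(item_scores, total_scores))
    mean_correct = mean(t for i, t in pairs if i == 1)
    mean_wrong = mean(t for i, t in pairs if i == 0)
    return (mean_correct - mean_wrong) / sd * (p * (1 - p)) ** 0.5


def mean_change_t(diffs):
    """Mean of regular-minus-pretest differences within one revision
    category, with a one-sample t statistic against zero (assumed test)."""
    n = len(diffs)
    m = mean(diffs)
    se = stdev(diffs) / n ** 0.5  # standard error of the mean difference
    return m, m / se


# Hypothetical toy data: four test-takers, one item.
item = [1, 1, 0, 0]
totals = [4, 3, 2, 1]
print(p_value(item), point_biserial(item, totals))

# Hypothetical differences in p-values for three items in one category.
print(mean_change_t([0.02, 0.04, 0.06]))
```

A positive mean difference in p-values would indicate that the items in that category became easier between pretest and regular test; the same function applies to differences in point-biserial correlations.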

Keyword [en]
achievement testing, textual revisions, item statistics, item pretesting, preequating
URN: urn:nbn:se:umu:diva-25431
OAI: diva2:231764
Available from: 2009-08-17 Created: 2009-08-17 Last updated: 2010-01-14
In thesis
1. A perfect score: Validity arguments for college admission tests
2009 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

College admission tests are of great importance for admissions systems in general and for candidates in particular. The SweSAT (Högskoleprovet in Swedish) has been used for college admission in Sweden for more than 30 years, and today it is, alongside the upper-secondary school GPA, the most widely used instrument for selecting college applicants. Because of the importance placed on the SweSAT, it is essential that the scores are reliable and that the interpretations and uses of the scores are valid. The main purpose of this thesis was therefore to examine some assumptions that are important for the validity of the interpretation and use of SweSAT scores. The argument-based approach to validation was used as the framework for the evaluation of these assumptions.
The thesis consists of four papers and an extensive introduction with summaries of the papers. The first three papers examine assumptions that are relevant for the use of SweSAT scores for admission decisions, while the fourth paper examines an assumption that is relevant for the use of SweSAT scores for providing diagnostic information. The first paper is a review of predictive validity studies that have been performed on the SweSAT. The general conclusion from the review is that the predictive validity of SweSAT scores varies greatly among study programs, and that there are many problematic issues related to the methodology of the predictive validity studies. The second paper focuses on an assumption underlying the current SweSAT equating design, namely that the groups taking different forms of the test have equal abilities. The results show that this assumption is highly problematic, and consequently a more appropriate equating design should be applied when equating SweSAT scores. The third paper examines the effect of textual item revisions on item statistics and preequating outcomes, using data from the SweSAT data sufficiency subtest. The results show that most kinds of revisions have a significant effect on both p-values and point-biserial correlations, and as a consequence the preequating outcomes are affected negatively. The fourth paper examines whether there is added value in reporting subtest scores rather than just the total score to the test-takers. Using a method derived from classical test theory, the results show that all observed subscores are better predictors of the true subscores than is the observed total score, with the exception of the Swedish reading comprehension subtest. That is, the subscores contain information that the test-takers can use for remedial studies and hence there is added value in reporting the subscores. The general conclusion from the thesis as a whole is that the interpretations and uses of SweSAT scores are based on several questionable assumptions, but also that the interpretations and uses are supported by a great deal of validity evidence.

Place, publisher, year, edition, pages
Umeå: Institutionen för Beteendevetenskapliga mätningar, Umeå Universitet, 2009. 58 p.
Academic dissertations at the department of Educational Measurement, ISSN 1652-9650 ; 4
college admission tests, SweSAT, validity, interpretive arguments, predictive validity, equating, item revisions, subscores
urn:nbn:se:umu:diva-25433 (URN)
Beteendevetenskapliga mätningar, 90187, Umeå
Public defence
2009-09-25, Hörsal E, Humanisthuset, Umeå Universitet, Umeå, 13:15 (English)
Available from: 2009-09-04 Created: 2009-08-17 Last updated: 2009-09-04 Bibliographically approved

