Efficiency analysis of item response theory kernel equating for mixed-format tests
Wallmark, Joakim. Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics.
Josefsson, Maria. Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0002-1812-3581
Wiberg, Marie. Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0001-5549-8262
2023 (English)
In: Applied psychological measurement, ISSN 0146-6216, E-ISSN 1552-3497, Vol. 47, no 7-8, p. 496-512
Article in journal (Refereed) Published
Abstract [en]

This study aims to evaluate the performance of Item Response Theory (IRT) kernel equating in the context of mixed-format tests by comparing it to IRT observed score equating and kernel equating with log-linear presmoothing. Comparisons were made through both simulations and real data applications, under both equivalent groups (EG) and non-equivalent groups with anchor test (NEAT) sampling designs. To prevent bias towards IRT methods, data were simulated with and without the use of IRT models. The results suggest that the difference between IRT kernel equating and IRT observed score equating is minimal, both in terms of the equated scores and their standard errors. The application of IRT models for presmoothing yielded smaller standard error of equating than the log-linear presmoothing approach. When test data were generated using IRT models, IRT-based methods proved less biased than log-linear kernel equating. However, when data were simulated without IRT models, log-linear kernel equating showed less bias. Overall, IRT kernel equating shows great promise when equating mixed-format tests.
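
As a rough illustration of the kernel equating idea compared in the abstract, the following is a minimal sketch of Gaussian kernel equating under an equivalent groups (EG) design. It is not the authors' implementation: it skips presmoothing entirely (neither IRT nor log-linear), takes the score probabilities as given, and uses fixed bandwidths rather than selected ones. The function names, the bandwidth default of 0.6, and the toy score distributions are all assumptions made for illustration.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_cdf(x, scores, probs, h):
    """Gaussian-kernel continuized CDF of a discrete score distribution.

    The shrinkage factor `a` keeps the mean and variance of the
    continuized distribution equal to those of the discrete one.
    """
    mu = sum(s * p for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * norm_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )


def equate_x_to_y(x, x_scores, x_probs, y_scores, y_probs, h_x=0.6, h_y=0.6):
    """Equate score x on form X to the scale of form Y.

    Computes F(x) on the continuized X distribution, then inverts the
    continuized Y distribution by bisection (hypothetical helper, EG design).
    """
    target = kernel_cdf(x, x_scores, x_probs, h_x)
    spread = max(y_scores) - min(y_scores)
    lo = min(y_scores) - spread - 10.0 * h_y
    hi = max(y_scores) + spread + 10.0 * h_y
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if kernel_cdf(mid, y_scores, y_probs, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy example (made-up probabilities): form Y is form X shifted up by 2 points.
x_scores = [0, 1, 2, 3, 4]
y_scores = [2, 3, 4, 5, 6]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

for x in x_scores:
    print(x, round(equate_x_to_y(x, x_scores, probs, y_scores, probs), 4))
```

In the methods the study compares, the score probabilities fed into the continuization step would first be presmoothed, either by a fitted IRT model or by a log-linear model. That step is omitted here.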

Place, publisher, year, edition, pages
Sage Publications, 2023. Vol. 47, no 7-8, p. 496-512
Keywords [en]
item response theory, kernel equating, log-linear models, presmoothing, simulation
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:umu:diva-215929
DOI: 10.1177/01466216231209757
ISI: 001087283200001
Scopus ID: 2-s2.0-85174542085
OAI: oai:DiVA.org:umu-215929
DiVA, id: diva2:1809187
Funder
Marianne and Marcus Wallenberg Foundation, 2019.0129
Available from: 2023-11-02 Created: 2023-11-02 Last updated: 2025-04-24 Bibliographically approved
In thesis
1. Extensions and applications of item response theory
2025 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Vidareutveckling och tillämpningar av item response theory (English: Further development and applications of item response theory)
Abstract [en]

This doctoral thesis focuses on Item Response Theory (IRT), a statistical method widely used in fields such as education and psychology to analyze response patterns on tests and surveys. In practice, IRT models are estimated using collected test data, which allows researchers to assess both how effectively each item measures the underlying trait—such as subject knowledge or personality characteristics—that the test aims to evaluate, and to estimate each individual's level of that trait. Unlike traditional methods that simply sum predetermined item scores, IRT accounts for the difficulty of each item and its ability to measure the intended trait.

The thesis consists of four research articles, each addressing different aspects of IRT and its applications. The first article focuses on test equating, ensuring that scores from different versions of a test are comparable. Equating methods with and without IRT are compared using simulations to explore the advantages and disadvantages of incorporating IRT into the kernel equating framework. The second and third articles introduce and compare different types of IRT models. Through simulations and real test data examples, these studies demonstrate that more flexible models can better capture the true relationships between test responses and the underlying traits being measured.

Finally, the IRTorch Python package is presented in the fourth study. IRTorch supports various IRT models and estimation methods and can be used to analyze data from different types of tests and surveys. In summary, the thesis demonstrates how IRT-based equating methods can serve as an alternative to traditional equating methods, how more flexible IRT models can improve the precision of test results, and how user-friendly software can make advanced statistical models accessible to a wider audience.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2025. p. 25
Series
Statistical studies, ISSN 1100-8989 ; 60
Keywords
Machine learning, Autoencoders, Item response theory, psychometrics, Test equating, Statistical software, Educational assessment, Latent variable modelling
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-233351 (URN)
978-91-8070-572-1 (ISBN)
978-91-8070-571-4 (ISBN)
Public defence
2025-02-07, HUM.D.220 (Hjortronlandet), Humanisthuset, Umeå University, Umeå, 09:00 (English)
Available from: 2025-01-08 Created: 2024-12-31 Last updated: 2025-01-08 Bibliographically approved

Open Access in DiVA

fulltext (1142 kB), 143 downloads
File information
File name: FULLTEXT02.pdf
File size: 1142 kB
Checksum (SHA-512): fe18ff94460b94f0ad67c1b11fb370c5a464c652d55655ff5f6abcf6d6a396f391641cb0063098b2dd233a582db227f6d86b4db816ec6b6d4cfe239630ff15c8
Type: fulltext
Mimetype: application/pdf

Authority records

Wallmark, Joakim; Josefsson, Maria; Wiberg, Marie
