Publications (10 of 83)
Lindvall, J., Helenius, O. & Wiberg, M. (2018). Critical features of professional development programs: Comparing content focus and impact of two large-scale programs. Teaching and Teacher Education: An International Journal of Research and Studies, 70, 121-131
Critical features of professional development programs: Comparing content focus and impact of two large-scale programs
2018 (English) In: Teaching and Teacher Education: An International Journal of Research and Studies, ISSN 0742-051X, E-ISSN 1879-2480, Vol. 70, p. 121-131. Article in journal (Refereed). Published
Abstract [en]

By comparing the content and the impact on student achievement of two large-scale professional development programs, we contribute to research on critical features of high-quality professional development, especially content focus. Even though the programs are conducted in the same context and are highly similar when characterized according to established research frameworks, our results suggest that they differ in their impact on student achievement. We therefore develop an analytical framework that allows us to characterize the programs' content and delivery in detail. Through this approach, we identify important differences between the programs that help explain their differing impacts.

Place, publisher, year, edition, pages
Pergamon-Elsevier Science Ltd, 2018
Keywords
Content focus, Student achievement, Teacher professional development
National Category
Pedagogy
Identifiers
urn:nbn:se:umu:diva-144940 (URN) 10.1016/j.tate.2017.11.013 (DOI) 000423641900012
Available from: 2018-02-23 Created: 2018-02-23 Last updated: 2018-06-09. Bibliographically approved
Wiberg, M. (2018). equateIRT Package in R. Measurement, 16(3), 195-202
equateIRT Package in R
2018 (English) In: Measurement, ISSN 1536-6367, E-ISSN 1536-6359, Vol. 16, no 3, p. 195-202. Article in journal (Refereed). Published
Abstract [en]

Equating test scores between different achievement test versions is important to ensure comparability between test takers’ scores. As many items are modelled with item response theory (IRT), it makes sense to also equate the test scores with IRT equating methods. The equateIRT package in R provides a set of functions that implement IRT equating methods, including newer extensions. This paper summarizes some of the advances in equating with IRT, reviews the equateIRT package, and demonstrates, through two illustrative examples, some of the key features of the package.
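As a rough illustration of the underlying idea only, and not of the equateIRT package's own interface, the following base-R sketch carries out IRT true-score equating for two hypothetical 20-item 2PL forms whose item parameters are assumed to already be on a common scale (the linking step that the package's equating coefficients handle in practice). All parameter values are made up for the example.

## Base-R sketch of IRT true-score equating under a 2PL model.
## Item parameters are hypothetical and assumed to be on a common scale.
set.seed(1)
p2pl <- function(theta, a, b) 1 / (1 + exp(-a * (theta - b)))

## Hypothetical parameters for two 20-item forms X and Y
aX <- runif(20, 0.8, 1.6); bX <- rnorm(20, 0.0, 1)
aY <- runif(20, 0.8, 1.6); bY <- rnorm(20, 0.2, 1)

true_score <- function(theta, a, b) sum(p2pl(theta, a, b))

## For each interior sum score on form X, find the ability with that
## expected score and evaluate the expected (true) score on form Y.
equate_tse <- function(scores) {
  sapply(scores, function(s) {
    theta <- uniroot(function(t) true_score(t, aX, bX) - s, c(-6, 6))$root
    true_score(theta, aY, bY)
  })
}

round(data.frame(score_X = 1:19, equated_Y = equate_tse(1:19)), 2)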

Place, publisher, year, edition, pages
Routledge, 2018
Keywords
IRT observed-score equating, IRT true-score equating, equating coefficients, bisector method, IRT
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-151901 (URN) 10.1080/15366367.2018.1492866 (DOI) 000444115500005
Funder
Swedish Research Council, 2014-578
Available from: 2018-09-17 Created: 2018-09-17 Last updated: 2018-10-05. Bibliographically approved
Leôncio, W. & Wiberg, M. (2018). Evaluating equating transformations from different frameworks. In: Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, Dylan Molenaar (Eds.), Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017 (pp. 101-110). Cham, Switzerland: Springer
Evaluating equating transformations from different frameworks
2018 (English) In: Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017 / [ed] Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, Dylan Molenaar, Cham, Switzerland: Springer, 2018, p. 101-110. Chapter in book (Refereed)
Abstract [en]

Test equating is used to ensure that test scores from different test forms can be used interchangeably. This paper aims to compare the statistical and computational properties of three equating frameworks: item response theory observed-score equating (IRTOSE), kernel equating, and kernel IRTOSE. The real data applications suggest that IRT-based frameworks tend to provide more stable and accurate results than kernel equating. Nonetheless, kernel equating can provide satisfactory results if a good model for the data can be found, while also being much faster than the IRT-based frameworks. Our general recommendation is to try all methods and examine how much the equated scores change, always ensuring that the assumptions are met and that a good model for the data can be found.

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2018
Series
Springer Proceedings in Mathematics & Statistics, ISSN 2194-1009, E-ISSN 2194-1017; 233
Keywords
Test equating, item response theory, kernel equating, observed-score equating
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-147092 (URN) 10.1007/978-3-319-77249-3 (DOI) 978-3-319-77248-6 (ISBN) 978-3-319-77249-3 (ISBN)
Available from: 2018-04-26 Created: 2018-04-26 Last updated: 2018-06-09
Wallin, G., Häggström, J. & Wiberg, M. (2018). How to select the bandwidth in kernel equating — An evaluation of five different methods. In: Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, Dylan Molenaar (Eds.), Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017 (pp. 91-100). Cham, Switzerland: Springer
How to select the bandwidth in kernel equating — An evaluation of five different methods
2018 (English) In: Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017 / [ed] Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, Dylan Molenaar, Cham, Switzerland: Springer, 2018, p. 91-100. Chapter in book (Refereed)
Abstract [en]

When using kernel equating to equate two test forms, a bandwidth needs to be selected. The bandwidth parameter determines the smoothness of the continuized score distributions and has been shown to have a large effect on the kernel density estimate. A number of criteria for selecting the bandwidth have been suggested, and four of them have so far been implemented in kernel equating. In this paper, all four existing bandwidth selectors suggested for kernel equating are evaluated, together with a new criterion based on leave-one-out cross-validation, and compared against each other using real test data. Although the bandwidth methods generally produced similar equated scores, there were potentially important differences in the upper part of the score scale, where critical admission decisions are typically made.
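For background only, and using a synthetic score distribution, the sketch below shows the Gaussian-kernel continuization step and the classical penalty-type criterion that most of the existing bandwidth selectors build on; it does not reproduce the cross-validation criterion proposed in the chapter.

## Gaussian-kernel continuization of a discrete score distribution and a
## simple penalty criterion for the bandwidth h (synthetic data only).
scores <- 0:30
probs  <- dbinom(scores, size = 30, prob = 0.55)  # stand-in for presmoothed probabilities
mu  <- sum(scores * probs)
sg2 <- sum((scores - mu)^2 * probs)

## Continuized density at points x for bandwidth h
f_h <- function(x, h) {
  a <- sqrt(sg2 / (sg2 + h^2))
  sapply(x, function(xx)
    sum(probs * dnorm((xx - a * scores - (1 - a) * mu) / (a * h)) / (a * h)))
}

## Penalty: squared distance between the discrete probabilities and the
## continuized density evaluated at the score points
pen <- function(h) sum((probs - f_h(scores, h))^2)

h_grid <- seq(0.2, 2, by = 0.01)
h_grid[which.min(sapply(h_grid, pen))]   # bandwidth minimizing the penalty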

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2018
Series
Springer Proceedings in Mathematics & Statistics, ISSN 2194-1009, E-ISSN 2194-1017; 233
Keywords
Kernel equating, Continuization, Bandwidth selection, Cross-validation
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-147091 (URN) 10.1007/978-3-319-77249-3 (DOI) 978-3-319-77248-6 (ISBN) 978-3-319-77249-3 (ISBN)
Available from: 2018-04-26 Created: 2018-04-26 Last updated: 2018-06-09
Laukaityte, I. & Wiberg, M. (2018). Importance of sampling weights in multilevel modeling of international large-scale assessment data. Communications in Statistics - Theory and Methods, 47(20), 4991-5012
Importance of sampling weights in multilevel modeling of international large-scale assessment data
2018 (English) In: Communications in Statistics - Theory and Methods, ISSN 0361-0926, E-ISSN 1532-415X, Vol. 47, no 20, p. 4991-5012. Article in journal (Refereed). Published
Abstract [en]

Multilevel modeling is an important tool for analyzing large-scale assessment data. However, standard multilevel modeling will typically give biased results for such complex survey data. This bias can be eliminated by introducing design weights, which must be used carefully, as they can affect the results. The aim of this paper is to examine different approaches and to give recommendations on handling design weights in multilevel models when analyzing large-scale assessments such as TIMSS (The Trends in International Mathematics and Science Study). To this end, we examined real data from two countries and conducted a simulation study. The empirical study showed that using no weights, or only level 1 weights, could sometimes lead to misleading conclusions. The simulation study showed only small differences between the weighted and unweighted model estimates when informative design weights were used. The use of unscaled or non-rescaled weights, however, caused significant differences in some parameter estimates.
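As a small illustration (not the paper's own code or data), the base-R sketch below applies, within each cluster of a synthetic data set, the two rescalings of level-1 design weights most often discussed in this literature: scaling so the weights sum to the cluster sample size, and scaling so they sum to the effective cluster sample size.

## Two common rescalings of level-1 (student) design weights within clusters.
## The data are synthetic and purely illustrative.
set.seed(2)
d <- data.frame(school = rep(1:3, each = 5),
                w1     = runif(15, 0.5, 2))        # raw level-1 weights

rescale <- function(w) {
  data.frame(raw       = w,
             cluster   = w * length(w) / sum(w),   # sums to the cluster sample size
             effective = w * sum(w) / sum(w^2))    # sums to the effective sample size
}

by(d$w1, d$school, rescale)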

Place, publisher, year, edition, pages
Philadelphia: Taylor & Francis, 2018
Keywords
informative weights, two-stage sampling, rescaling weights, simulation study
National Category
Probability Theory and Statistics
Research subject
Statistics; Education
Identifiers
urn:nbn:se:umu:diva-128588 (URN) 10.1080/03610926.2017.1383429 (DOI) 000440044100006
Funder
Swedish Research Council, 2015-02160
Available from: 2016-12-07 Created: 2016-12-07 Last updated: 2018-09-04. Bibliographically approved
Wiberg, M., Ramsay, J. O. & Li, J. (2018). Optimal scores: an alternative to parametric item response theory and sum scores. Psychometrika
Optimal scores: an alternative to parametric item response theory and sum scores
2018 (English) In: Psychometrika, ISSN 0033-3123, E-ISSN 1860-0980. Article in journal (Refereed). Published
Abstract [en]

The aim of this paper is to discuss nonparametric item response theory scores, in terms of optimal scores, as an alternative to parametric item response theory scores and sum scores. Optimal scores take advantage of the interaction between performance and item impact that is evident in most testing data. The theoretical arguments in favor of optimal scoring are supplemented with results from simulation experiments, and the analysis of test data suggests that a sum-scored test would need to be longer than an optimally scored test to attain the same level of accuracy. Because optimal scoring is built on a nonparametric procedure, it also offers a flexible alternative for estimating item characteristic curves, one that can accommodate items showing poor fit to item response theory models.
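To make the contrast with sum scores concrete, here is a small base-R sketch, with simulated data, of likelihood-based scoring of a single response vector; for simplicity the item curves are parametric 2PL, whereas the optimal scores discussed in the paper use nonparametrically estimated curves. A likelihood-based score weights items by how informative they are at the test taker's ability level, which is the performance-item impact interaction the abstract refers to, while a plain sum score weights all items equally.

## Sum score versus likelihood-based ability score for one simulated test taker.
## Item curves are parametric 2PL here purely for simplicity.
set.seed(3)
n_items <- 25
a <- runif(n_items, 0.7, 1.8); b <- rnorm(n_items)
p <- function(theta) 1 / (1 + exp(-a * (theta - b)))

theta_true <- 0.8
u <- rbinom(n_items, 1, p(theta_true))          # simulated 0/1 responses

loglik <- function(theta) sum(u * log(p(theta)) + (1 - u) * log(1 - p(theta)))
theta_hat <- optimize(loglik, c(-4, 4), maximum = TRUE)$maximum

c(sum_score = sum(u), likelihood_score = round(theta_hat, 2))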

Place, publisher, year, edition, pages
The Psychometric Society, 2018
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-152958 (URN) 10.1007/s11336-018-9639-4 (DOI)
Funder
Swedish Research Council, 2014-578
Available from: 2018-10-31 Created: 2018-10-31 Last updated: 2018-11-09. Bibliographically approved
Wiberg, M., Ramsay, J. O. & Li, J. (2018). Optimal Scores as an Alternative to Sum Scores. In: Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, & Dylan Molenaar (Eds.), Quantitative Psychology: The 82nd Annual Meeting of the Psychometric Society, Zurich, Switzerland, 2017 (pp. 1-10). Cham, Switzerland: Springer
Optimal Scores as an Alternative to Sum Scores
2018 (English) In: Quantitative Psychology: The 82nd Annual Meeting of the Psychometric Society, Zurich, Switzerland, 2017 / [ed] Marie Wiberg, Steven Culpepper, Rianne Janssen, Jorge González, & Dylan Molenaar, Cham, Switzerland: Springer, 2018, p. 1-10. Chapter in book (Refereed)
Abstract [en]

This paper discusses the use of optimal scores as an alternative to sum scores and expected sum scores when analyzing test data. Optimal scores are built on nonparametric methods and use the interaction between the test takers' responses to each item and the impact of the corresponding items on the estimate of their performance. Both theoretical arguments for optimal scores and arguments built on simulation results are given. The paper claims that, to achieve the same accuracy in terms of mean squared error and root mean squared error, an optimally scored test needs substantially fewer items than a sum-scored test. The top-performing test takers and the bottom 5% of test takers are by far the groups that benefit most from using optimal scores.

Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2018
Series
Springer Proceedings in Mathematics & Statistics, ISSN 2194-1009, E-ISSN 2194-1017; 233
Keywords
Optimal scoring, Item impact, Sum scores, Expected sum scores
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-147090 (URN) 10.1007/978-3-319-77249-3 (DOI) 978-3-319-77248-6 (ISBN) 978-3-319-77249-3 (ISBN)
Available from: 2018-04-26 Created: 2018-04-26 Last updated: 2018-06-09
Wiberg, M., Culpepper, S., Janssen, R., González, J. & Molenaar, D. (Eds.). (2018). Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017. Cham, Switzerland: Springer
Quantitative Psychology: The 82nd annual meeting of the psychometric society, Zurich, Switzerland, 2017
2018 (English) Conference proceedings (editor) (Refereed)
Place, publisher, year, edition, pages
Cham, Switzerland: Springer, 2018
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-147088 (URN) 10.1007/978-3-319-77249-3 (DOI) 978-3-319-77248-6 (ISBN) 978-3-319-77249-3 (ISBN)
Available from: 2018-04-26 Created: 2018-04-26 Last updated: 2018-06-09
Sansivieri, V., Wiberg, M. & Matteucci, M. (2017). A review of test equating methods with a special focus on IRT-based approaches. Statistica, 77(4), 329-352
A review of test equating methods with a special focus on IRT-based approaches
2017 (English) In: Statistica, ISSN 1973-2201, Vol. 77, no 4, p. 329-352. Article, review/survey (Refereed). Published
Keywords
item-response theory, test equating
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-147094 (URN)
Available from: 2018-04-26 Created: 2018-04-26 Last updated: 2018-06-09. Bibliographically approved
Ramsay, J. O. & Wiberg, M. (2017). A Strategy for Replacing Sum Scoring. Journal of Educational and Behavioral Statistics, 42(3), 282-307
A Strategy for Replacing Sum Scoring
2017 (English) In: Journal of Educational and Behavioral Statistics, ISSN 1076-9986, E-ISSN 1935-1054, Vol. 42, no 3, p. 282-307. Article in journal (Refereed). Published
Abstract [en]

This article promotes the use of modern test theory in testing situations where sum scores for binary responses are currently used. It directly compares the efficiencies and biases of classical and modern test analyses and finds an improvement in the root mean squared error of ability estimates of about 5% for two designed multiple-choice tests and about 12% for a classroom test. A new parametric density function for ability estimates, the tilted scaled distribution, is used to resolve the nonidentifiability of the univariate test theory model. Item characteristic curves (ICCs) are represented as basis function expansions of their log-odds transforms. A parameter cascading method with roughness penalties is used to estimate the corresponding log odds of the ICCs, and this method is shown to be computationally efficient enough to support the analysis of large data sets.
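As a small illustration of the ICC representation described above (with arbitrary, hypothetical basis coefficients, not those estimated in the article), the base-R sketch below builds an item characteristic curve as a B-spline expansion of its log-odds transform.

## An item characteristic curve represented through a basis expansion of
## its log-odds; the basis coefficients here are arbitrary illustration values.
library(splines)
theta <- seq(-3, 3, length.out = 200)
B     <- bs(theta, df = 5)                              # B-spline basis over the ability range
W     <- -3 + as.vector(B %*% c(0.5, 1.5, 3, 4.5, 6))   # log-odds W(theta)
P     <- plogis(W)                                      # ICC: P(theta) = exp(W) / (1 + exp(W))
plot(theta, P, type = "l", ylim = c(0, 1),
     xlab = "ability", ylab = "probability of a correct response")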

Place, publisher, year, edition, pages
Sage Publications, 2017
Keywords
parameter cascading, item characteristic curves, tilted scaled distribution, sum score distribution, performance manifold
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-136047 (URN) 10.3102/1076998616680841 (DOI) 000401153600003
Available from: 2017-06-22 Created: 2017-06-22 Last updated: 2018-06-09. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-5549-8262