A comparison of two different methods for setting performance standards for a test with constructed-response items
Umeå University, Faculty of Social Sciences, Department of Educational Measurement.
Umeå University, Faculty of Social Sciences, Department of Educational Measurement.
2008 (English). In: Practical Assessment, Research & Evaluation, ISSN 1531-7714, E-ISSN 1531-7714, Vol. 13, no 9, 12- p. Article in journal (Refereed), Published
Abstract [en]

The trustworthiness of performance standards influences the credibility of criterion-referenced large-scale testing. In this paper, two standard-setting methods are evaluated and compared when applied to a test with polytomously scored constructed-response items. A version of the Angoff method is chosen as representative of the class of test-centred standard-setting procedures and the borderline-group method represents the class of examinee-centred procedures. The evaluation is based on procedural, internal and external evidence. The results indicate that both methods provide reasonable and trustworthy approaches to standard setting, but also confirm some of the potential problems with these methods.

Place, publisher, year, edition, pages
College Park, Md.: ERIC Clearinghouse on Assessment and Evaluation and the Department of Measurement, Statistics, and Evaluation at the University of Maryland, 2008. Vol. 13, no 9, 12- p.
Identifiers
URN: urn:nbn:se:umu:diva-3509
OAI: oai:DiVA.org:umu-3509
DiVA: diva2:142243
Available from: 2008-09-30 Created: 2008-09-30 Last updated: 2017-12-14 (Bibliographically approved)
In thesis
1. Measurement of alignment between standards and assessment
2008 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Many educational systems today are standards-based and aim for alignment, i.e. consistency, among the components of the educational system: standards, teaching and assessment. To conclude whether the alignment is sufficiently high, analyses with a useful model are needed. This thesis investigates the usefulness of models for analyzing alignment between standards and assessments, with emphasis on one method: Bloom’s revised taxonomy. The thesis comprises an introduction and five articles that empirically investigate the usefulness of methods for alignment analyses.

In the first article, the usefulness of different models for analyzing alignment between standards and assessment is theoretically and empirically compared based on a number of criteria. The results show that Bloom’s revised taxonomy is the most useful model. The second article investigates the usefulness of Bloom’s revised taxonomy for interpretation of standards in mathematics with two differently composed panels of judges. One panel consisted of teachers and the other panel of assessment experts. The results show that Bloom’s revised taxonomy is useful for interpretation of standards, but that many standards are multi-categorized (placed in more than one category). The results also show higher levels of intra- and inter-judge consistency for assessment experts than for teachers. The third article further investigates the usefulness of Bloom’s revised taxonomy for analyses of alignment between standards and assessment. The results show that Bloom’s revised taxonomy is useful for analyses of both standards and assessments. The fourth article studies whether vague and general standards can explain the large proportion of multi-categorized standards in mathematics. The strategy was to divide a set of standards into smaller substandards and then compare the usefulness and inter-judge consistency for categorization with Bloom’s revised taxonomy for undivided and divided standards. The results show that vague and general standards do not explain the large proportion of multi-categorized standards. Another explanation is related to the nature of mathematics that often intertwines conceptual and procedural knowledge. This was also studied in the article and the results indicate that this is a probable explanation. The fifth article focuses on another aspect of alignment between standards and assessment, namely the alignment between performance standards and cut-scores for a specific assessment. The validity of two standard-setting methods, the Angoff method and the borderline-group method, was investigated. The results show that both methods derived reasonable and trustworthy cut-scores, but also that there are potential problems with these methods.

In the introductory part of the thesis, the empirical studies are summarized, contextualized and discussed. The discussion relates alignment to validity issues for assessments and relates the obtained empirical results to theoretical assumptions and applied implications. One conclusion of the thesis is that Bloom’s revised taxonomy is useful for analyses of alignment between standards and assessments. Another conclusion is that the two standard-setting methods derive reasonable and trustworthy results. It is preferable if an alignment model can be used both for alignment analyses and in ongoing practice for increasing alignment. Bloom’s revised taxonomy has the potential to be such an alignment model. This thesis has found this taxonomy useful for alignment analyses, but its usefulness for increasing alignment in ongoing practice has to be investigated.

Place, publisher, year, edition, pages
Umeå: Beteendevetenskapliga mätningar, 2008. 226 p.
Series
Academic dissertations at the department of Educational Measurement, ISSN 1652-9650 ; 3
Keyword
alignment, standards, assessment, Bloom's revised taxonomy, the Angoff method, the borderline-group method, usefulness, validity
National Category
Manufacturing, Surface and Joining Technology
Identifiers
urn:nbn:se:umu:diva-1865 (URN)
978-91-7264-662-9 (ISBN)
Public defence
2008-10-24, S205, Samhällsvetarhuset, Umeå, 10:15 (English)
Opponent
Supervisors
Available from: 2008-09-30 Created: 2008-09-30 Last updated: 2017-03-28 (Bibliographically approved)

Open Access in DiVA

No full text

Other links

http://pareonline.net/pdf/v13n9.pdf

Search in DiVA

By author/editor
Näsström, Gunilla; Nyström, Peter