Licence to drive: the importance of reliability for the validity of the Swedish driving licence test
Umeå University, Faculty of Social Sciences, Department of applied educational science, Department of Educational Measurement.
2019 (English) Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Background: The Swedish driving licence test is a criterion-referenced test resulting in a pass or fail. It currently consists of two parts - a theory test with 65 multiple-choice items and a practical driving test where at least 25 minutes are spent driving in traffic. It is a high-stakes test in the sense that the results are used to determine whether the test-taker should be allowed to drive a car without supervision. As the only other requirements for obtaining a licence are a few hours of hazard education (and a short introduction if you intend to drive with a lay instructor), it is important that the test result, in terms of pass or fail, is reliable and valid. If this is not the case, it could have detrimental effects on traffic safety. Examining all relevant aspects is beyond the scope of this licentiate thesis, so I have focused on reliability.

Methods: Reliability for both the theoretical and practical test results was examined. As these are very different types of tests, the types of reliability examined also differed. To examine inter-rater reliability of the driving test, 83 examiners were accompanied by one of five selected supervising examiners for a day of tests. In all, 535 tests were conducted with two examiners assessing the same performance. At the end of the day the examiners compared notes and tried to determine the reason for any inconsistencies. Both examiners and students also filled in questionnaires with questions about background and preparation. To study decision consistency and decision accuracy of the theory test, three test versions (a total of around 12,000 tests) were examined using methods devised by Subkoviak (Subkoviak, 1976, 1988) and Hanson & Brennan (Brennan, 2004; Hanson & Brennan, 1990).

Results: The results from two research studies concerning reliability were presented. Study I focused on inter-rater reliability in the driving test; in 93 per cent of cases the examiners made the same assessment. For the tests where their opinions differed, there was no correlation with any of the background variables or other variables examined, except for three, which had logical explanations and did not constitute a problem. Although there were cases where the differences were due to different stances on matters of interpretation, the most common suggested cause was the placement in the car (back seat vs. front seat). Although the supervising examiners gave both praise and criticism as to how the test was carried out, the study does not answer the question of whether the tests were equal in terms of composition and difficulty.

In Study II the focus was on decision consistency and decision accuracy in the theory test. Three versions of the theory test were examined and, on the whole, found to be fairly similar in terms of item difficulty and score distribution, but the mean was so close to the cut-score (i.e. the score required to pass) that the pass rate differed somewhat between versions. Agreement coefficients were around .80 for all test versions (between .79 and .82 depending on method). Classification accuracy indicated a .87 probability of a correct classification.

Conclusion: It is important to examine the reliability and validity of the driving licence test since a misclassification can have serious consequences in terms of traffic safety. In the studies included here the rate of agreement between examiners is deemed satisfactory. It would be preferable if the classification consistency and classification accuracy, as estimated by the methods used, were higher for the theory test, given its importance.

While reliability in terms of agreement between raters/examiners or consistency and accuracy of classification is routinely examined in other contexts, such as large-scale educational testing, this is not often done for driving licence tests. At the same time, the methods used here can be transferred to contexts where such properties are generally not examined. Collecting information about test-takers and examiners, as in Study I, can provide evidence concerning possible bias.

Examining to what extent decisions are consistent is one important aspect of collecting evidence that shows that test results can be used to draw conclusions about driver competence. Still, regardless of outcome, validation is a process that never ends. There is always reason to examine various aspects and make further improvements. There are also many other relevant aspects to examine. A prerequisite for the validity of the score interpretation of a criterion-referenced test like this one is that the cut-score is appropriate and the content relevant. This should therefore be the subject of further research as the validation process continues.

Place, publisher, year, edition, pages
Umeå: Department of applied educational science, Educational measurement, Umeå university, 2019, p. 56
Series
Academic dissertations at the department of Educational Measurement, ISSN 1652-9650 ; 12
Keywords [en]
Driving licence tests, driver's licence, driving test, theory test, licensing test, interrater reliability, classification consistency, examiner agreement, classification accuracy
Keywords [sv]
förarprov, körprov, kunskapsprov, reliabilitet, validitet, bedömare
National Category
Educational Sciences
Research subject
didactics of educational measurement
Identifiers
URN: urn:nbn:se:umu:diva-163949
ISBN: 9789178551156 (print)
OAI: oai:DiVA.org:umu-163949
DiVA, id: diva2:1360362
Presentation
2019-10-25, Aulan, Vårdvetarhuset, Umeå, 10:00 (Swedish)
Opponent
Supervisors
Available from: 2019-10-14 Created: 2019-10-12 Last updated: 2019-10-14 Bibliographically approved
List of papers
1. Agreement of driving examiners' assessments: evaluating the reliability of the Swedish driving test
2013 (English) In: Transportation Research Part F: Traffic Psychology and Behaviour, ISSN 1369-8478, E-ISSN 1873-5517, Vol. 19, p. 22-30. Article in journal (Refereed) Published
Abstract [en]

The purpose of this study was to examine the consistency of examiner assessments of test-takers' performance on the Swedish driving test. The study included 535 tests and was designed so that the ordinary examiner and a supervising examiner assessed the same test-taker. The assessment was done on a two-grade rating scale (pass/fail). Since the result can be affected by factors associated with the test-taker and the two examiners, questionnaires were developed and these were filled in by the test-takers and the examiners. Information about the administration of the test was collected via a specially designed form filled in by the supervising examiner. Using this form, the ordinary examiners' performance was rated on a number of aspects. The result from the study indicated that the agreement between the assessments was very good. For 93% of the tests the two examiners chose the same mark on the two-grade scale. In the cases where ratings differed, the analysis indicated only a few systematic differences among variables designed to provide possible explanations for differences in opinion. However, none of these was problematic with respect to consistency of assessment. Results indicated that most tests were carried out in a satisfactory manner.
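
The agreement figure reported above is a simple proportion of tests on which both examiners gave the same pass/fail mark. As a minimal sketch of how such a figure can be computed, the Python below calculates percent agreement for two raters. It also computes Cohen's kappa, a common chance-corrected companion statistic; kappa is an addition for illustration and is not reported in the abstract. The function name `agreement_stats` and the ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def agreement_stats(ratings_a, ratings_b):
    """Return (percent agreement, Cohen's kappa) for two raters' labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Observed agreement: share of tests where both raters gave the same mark.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)

    # Kappa is undefined when chance agreement is 1; treat that case as 1.0.
    kappa = (observed - expected) / (1 - expected) if expected < 1 else 1.0
    return observed, kappa


# Hypothetical marks for ten tests (not data from the study).
ordinary = ["pass", "pass", "fail", "fail", "pass",
            "fail", "pass", "pass", "fail", "pass"]
supervisor = ["pass", "pass", "fail", "pass", "pass",
              "fail", "pass", "pass", "fail", "pass"]

observed, kappa = agreement_stats(ordinary, supervisor)
print(f"percent agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Percent agreement is easy to interpret but does not account for agreement that would occur by chance when most test-takers fall in one category, which is why a chance-corrected index is often reported alongside it.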

Place, publisher, year, edition, pages
Elsevier, 2013
Keywords
Driving test, Reliability, Driving examiner
National Category
Educational Sciences
Identifiers
urn:nbn:se:umu:diva-76245 (URN); 10.1016/j.trf.2013.02.004 (DOI); 000319542100003 ()
Available from: 2013-07-08 Created: 2013-07-08 Last updated: 2019-10-14 Bibliographically approved
2. Is This Reliable Enough?: Examining Classification Consistency and Accuracy in a Criterion-Referenced Test
2016 (English) In: International journal of assessment tools in education, ISSN 2148-7456, Vol. 3, no 2, p. 137-150. Article in journal (Refereed) Published
Abstract [en]

One important step for assessing the quality of a test is to examine the reliability of test score interpretation. Which aspect of reliability is the most relevant depends on what type of test it is and how the scores are to be used. For criterion-referenced tests, and in particular certification tests, where students are classified into performance categories, primary focus need not be on the size of error but on the impact of this error on classification. This impact can be described in terms of classification consistency and classification accuracy. In this article selected methods from classical test theory for estimating classification consistency and classification accuracy were applied to the theory part of the Swedish driving licence test, a high-stakes criterion-referenced test which is rarely studied in terms of reliability of classification. The results for this particular test indicated a level of classification consistency that falls slightly short of the recommended level which is why lengthening the test should be considered. More evidence should also be gathered as to whether the placement of the cut-off score is appropriate since this has implications for the validity of classifications.
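
To make the idea of classification consistency from a single administration concrete, here is a simplified Python sketch in the spirit of a binomial-model approach such as Subkoviak's. It is not the article's implementation. The assumptions are: KR-21 is used as the reliability estimate, each test-taker's true proportion correct is estimated by regressing the observed proportion toward the group mean, and the scores and the cut-score of 52 are hypothetical. Only the 65-item test length comes from the record above.

```python
from math import comb
from statistics import mean, pvariance


def binom_tail(n_items, cut, p):
    """P(X >= cut) for X ~ Binomial(n_items, p)."""
    return sum(comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
               for k in range(cut, n_items + 1))


def kr21(scores, n_items):
    """KR-21 reliability estimate from total scores (assumes score variance > 0)."""
    m = mean(scores)
    v = pvariance(scores)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * v))


def decision_consistency(scores, n_items, cut):
    """Estimated probability of the same pass/fail decision on two parallel tests."""
    rel = min(max(kr21(scores, n_items), 0.0), 1.0)  # keep the weight in [0, 1]
    m = mean(scores)
    total = 0.0
    for x in scores:
        # Regressed estimate of this test-taker's true proportion correct.
        p_true = rel * x / n_items + (1 - rel) * m / n_items
        p_pass = binom_tail(n_items, cut, p_true)
        # Same decision twice: pass and pass, or fail and fail.
        total += p_pass**2 + (1 - p_pass) ** 2
    return total / len(scores)


# Hypothetical total scores on a 65-item test with an assumed cut-score of 52.
scores = [40, 45, 48, 50, 51, 52, 53, 54, 55, 56, 58, 60, 62, 63]
print(round(decision_consistency(scores, n_items=65, cut=52), 2))
```

Test-takers whose estimated true score lies near the cut-score contribute values close to .50, which shows why a score distribution centred near the cut-score lowers the overall coefficient.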

Place, publisher, year, edition, pages
International Journal of Assessment Tools in Education (IJATE), 2016
Keywords
reliability, criterion-referenced test, driving licence test, classification consistency, decision consistency, single administration
National Category
Probability Theory and Statistics; Educational Sciences
Identifiers
urn:nbn:se:umu:diva-139826 (URN); 10.21449/ijate.245198 (DOI); 000409392100003 ()
Available from: 2017-09-22 Created: 2017-09-22 Last updated: 2019-10-14 Bibliographically approved

Open Access in DiVA

S Alger Lic (1058 kB), 16 downloads
File information
File name FULLTEXT01.pdf, File size 1058 kB, Checksum SHA-512
873492a840491ef7dd4593531eb8c7bde1b42c605b49d28fb57494bf4ba473c651b88a23cfba5075bd9a14ce2e401b34353d265374712eee6da880af6fc7fd83
Type fulltext, Mimetype application/pdf

Authority records BETA

Alger, Susanne
