Latent program modeling: Inferring latent problem-solving strategies from a PISA problem-solving task
Umeå University, Faculty of Social Sciences, Department of applied educational science.
ORCID iD: 0000-0002-6998-3397
2022 (English). In: Journal of Educational Data Mining, E-ISSN 2157-2100, Vol. 14, no 1, p. 46-80.
Article in journal (Refereed). Published.
Abstract [en]

Response process data have the potential to provide a rich description of test-takers’ thinking processes. However, retrieving insights from these data presents a challenge for educational assessments and educational data mining as they are complex and not well annotated. The present study addresses this challenge by developing a computational model that simulates how different problem-solving strategies would behave while searching for a solution to a Programme for International Student Assessment (PISA) 2012 problem-solving item, and uses n-gram processing of data together with a naïve Bayesian classifier to infer latent problem-solving strategies from the test-takers’ response process data. The retrieval of simulated strategies improved with increased n-gram length, reaching an accuracy of 0.72 on the original PISA task. Applying the model to generalized versions of the task showed that classification accuracy increased with problem size and the mean number of actions, reaching a classification accuracy of 0.90 for certain task versions. The strategy that was most efficient and effective in the PISA Traffic task evaluated paths based on the labeled travel time. However, in generalized versions of the task, a straight line strategy was more effective. When applying the classifier to empirical data, the largest share of test-takers (46%) were classified as using a random path strategy. Test-takers classified as using the travel time strategy had the highest probability of solving the task (p̂ ≈ 1). The test-takers classified as using the random actions strategy had the lowest probability of solving the task (p̂ ≈ 0.11). The effect of (classified) strategy on general PISA problem-solving performance was overall weak, except for a negative effect for the random actions strategy (β ≈ −65, CI95% ≈ [−96, −36]). The study contributes a novel approach to inferring latent problem-solving strategies from action sequences. The study also illustrates how simulations can provide valuable information about item design by exploring how changing item properties could affect the accuracy of inferences about unobserved problem-solving strategies.
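
As a rough illustration of the general technique named in the abstract (n-gram features from action sequences, fed to a naive Bayes classifier trained on simulated strategies), the following minimal sketch may help. It is not the study's model: the action labels, the two toy strategies ("random" and "systematic"), the `simulate` function, and the `NgramNaiveBayes` class are all hypothetical stand-ins chosen for illustration, and the smoothing scheme is an assumption.

```python
import math
import random
from collections import Counter

# Hypothetical action labels; these are NOT the actions of the PISA item.
ACTIONS = ["A", "B", "C", "D"]


def ngrams(seq, n):
    """Return all contiguous n-grams of a sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def simulate(strategy, length, rng):
    """Toy generators standing in for a simulation of strategies (assumption)."""
    if strategy == "random":
        return [rng.choice(ACTIONS) for _ in range(length)]
    # "systematic": cycle through the actions in order from a random start
    start = rng.randrange(len(ACTIONS))
    return [ACTIONS[(start + i) % len(ACTIONS)] for i in range(length)]


class NgramNaiveBayes:
    """Multinomial naive Bayes over n-gram counts with additive smoothing."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n
        self.alpha = alpha

    def fit(self, sequences, labels):
        self.class_counts = Counter(labels)
        self.counts = {}
        self.vocab = set()
        for seq, label in zip(sequences, labels):
            grams = ngrams(seq, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.vocab.update(grams)
        return self

    def log_posteriors(self, seq):
        """Unnormalized log posterior of each strategy given the sequence."""
        total = sum(self.class_counts.values())
        # One extra smoothing slot is reserved for n-grams never seen in training.
        slots = len(self.vocab) + 1
        scores = {}
        for label, prior in self.class_counts.items():
            counts = self.counts[label]
            denom = sum(counts.values()) + self.alpha * slots
            score = math.log(prior / total)
            for gram in ngrams(seq, self.n):
                score += math.log((counts[gram] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, seq):
        scores = self.log_posteriors(seq)
        return max(scores, key=scores.get)


# Train on simulated sequences only, then classify an observed sequence.
rng = random.Random(0)
train = [(simulate(s, 12, rng), s)
         for s in ("random", "systematic") for _ in range(200)]
clf = NgramNaiveBayes(n=2).fit([x for x, _ in train], [y for _, y in train])
print(clf.predict(["A", "B", "C", "D", "A", "B"]))
```

In this sketch, raising `n` makes each feature capture a longer run of consecutive actions, which is one plausible reading of why longer n-grams could separate strategies better; the toy data here are far simpler than real response process data.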

Place, publisher, year, edition, pages
International Educational Data Mining Society , 2022. Vol. 14, no 1, p. 46-80
Keywords [en]
computational cognitive modeling, educational assessment, PISA, problem-solving, process data
National Category
Applied Psychology
Identifiers
URN: urn:nbn:se:umu:diva-203575
DOI: 10.5281/zenodo.6686443
Scopus ID: 2-s2.0-85145815648
OAI: oai:DiVA.org:umu-203575
DiVA, id: diva2:1728715
Available from: 2023-01-19 Created: 2023-01-19 Last updated: 2023-04-19. Bibliographically approved
In thesis
1. Exploring and modeling response process data from PISA: inferences related to motivation and problem-solving
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Modellering av responsprocessdata från PISA : inferenser relaterade till motivation och problemlösning
Abstract [en]

This thesis explores and models response process data from large-scale assessments, focusing on test-taking motivation, problem-solving strategies, and questionnaire response validity. It consists of four studies, all using data from PISA (Programme for International Student Assessment).

Study I processed and clustered log-file data to create a behavioral evaluation of students' effort applied to a PISA problem-solving item, and examined the relationship between students' behavioral effort, self-reported effort, and test performance. Results show that effort invested before leaving the task unsolved was positively related to performance, while effort invested before solving the tasks was not. Low effort before leaving the task unsolved was further related to lower self-reported effort. The findings suggest that test-taking motivation could only be validly measured from efforts exerted before giving up.

Study II used response process data to infer students' problem-solving strategies on a PISA problem-solving task, and investigated the efficiency of strategies and their relationship to PISA performance. A text classifier trained on data from a generative computational model was used to retrieve different strategies, reaching a classification accuracy of 0.72, which increased to 0.90 with item design changes. The most efficient strategies used information from the task environment to make plans. Test-takers classified as selecting actions randomly performed worse overall. The study concludes that computational modeling can inform score interpretation and item design.

Study III investigated the relationship between motivation to answer the PISA student questionnaire and test performance. Departing from the theory of satisficing in surveys, a Bayesian finite mixture model was developed to assess questionnaire-taking motivation. Results showed that overall motivation was high, but decreased toward the end. The questionnaire-taking motivation was positively related to performance, suggesting that it could be a proxy for test-taking motivation; however, reading skills may affect the estimation.

Study IV examines the validity of composite scores assessing reading metacognition, using a Bayesian finite mixture model that jointly considers response times and sequential patterns in subitem responses. The results show that the relatively high levels of satisficing (up to 30%) negatively biased composite scores. The study highlights the importance of considering response time data and subitem response patterns when evaluating the validity of scores from the student questionnaire.

In conclusion, response process data from international large-scale assessments can provide valuable insights into test-takers’ motivation, problem-solving strategies, and questionnaire validity.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2023. p. 53
Series
Academic dissertations at the department of Educational Measurement, ISSN 1652-9650 ; 15
Keywords
response processes, large-scale assessments, motivation, problem-solving, computational modeling, Bayesian modeling
National Category
Other Social Sciences not elsewhere specified
Research subject
didactics of educational measurement
Identifiers
urn:nbn:se:umu:diva-206866 (URN)
978-91-8070-058-0 (ISBN)
978-91-8070-057-3 (ISBN)
Public defence
2023-05-17, Aula Biologica, Umeå, 10:00 (English)
Available from: 2023-04-26 Created: 2023-04-19 Last updated: 2024-07-02. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Authority records

Lundgren, Erik
