The curated UNESCO Courier 1.0: annotated corpora for digital research in the global humanities
2024 (English) In: Journal of Open Humanities Data, E-ISSN 2059-481X, Vol. 10, article id 20
Article in journal (Refereed) Published
Abstract [en]
The monthly magazine of the United Nations Educational, Scientific and Cultural Organization, founded in 1948 as The UNESCO Courier, represents an extraordinary resource for research on global themes in the humanities. We present the Curated Courier 1.0, a package of digital text corpora, text analysis tools, and supplementary material that aims to make the complete archive of this publication from 1948 to 2020 machine-readable, accessible, and reusable for digital text analysis. One corpus compiles the text of all articles, which we carefully reconstructed and linked to a comprehensive curated metadata index while excluding additional text (masthead, photo captions, letters to the editor, and so on). A second corpus brings together the complete text of all issues. This article first presents the value of Courier as a source for digital research in the global humanities. Second, it outlines how we created the curated corpus and discusses some challenges we met. Third, it offers examples of tools researchers might use to explore and utilize the annotated corpus and discusses a few approaches that we have developed and tested.
Place, publisher, year, edition, pages
Ubiquity Press, 2024. Vol. 10, article id 20
Keywords [en]
global humanities, history, international organizations, text analysis, topic modeling, UNESCO
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:umu:diva-222305
DOI: 10.5334/johd.181
ISI: 001208889500022
Scopus ID: 2-s2.0-85186412888
OAI: oai:DiVA.org:umu-222305
DiVA, id: diva2:1844939
Funder
Swedish Research Council, 2019-03278
2024-03-15 2024-03-15 2026-01-19 Bibliographically approved