Adapting language specific components of cross-media analysis frameworks to less-resourced languages: the case of Amharic
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing)
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing) ORCID iD: 0000-0002-1112-2981
2020 (English) In: Proceedings of the 1st Joint SLTU and CCURL Workshop (SLTU-CCURL 2020) / [ed] Dorothee Beermann; Laurent Besacier; Sakriani Sakti; Claudia Soria, 2020, p. 298-305. Conference paper, Published paper (Refereed)
Abstract [en]

We present an ASR-based pipeline for Amharic that orchestrates NLP components within a cross-media analysis framework (CMAF). One of the challenges inherent to CMAFs is effectively addressing multilingual issues; as a result, many languages remain under-resourced and fail to benefit from available media analysis solutions. Although Amharic is spoken natively by over 22 million people and the amount of Amharic multimedia content on the Web is ever increasing, querying that content with simple text search is difficult. Searching audio and video content with simple keywords is even harder, as such content exists only in raw form. In this study, we introduce a spoken and textual content processing workflow into a CMAF for Amharic. We design an ASR-named entity recognition (NER) pipeline with three main components: ASR, a transliterator, and NER. We explore various acoustic modeling techniques and develop an OpenNLP-based NER extractor along with a transliterator that interfaces between ASR and NER. The resulting ASR-NER pipeline for Amharic strengthens the multilingual support of CMAFs, and the state-of-the-art design principles and techniques employed in this study can guide similar work on other less-resourced languages, particularly Semitic ones.
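To make the pipeline architecture described in the abstract concrete, a minimal Python sketch follows. The component classes (AmharicASR, Transliterator, NEROverOpenNLP), their trivial bodies, and the toy transliteration table are hypothetical placeholders chosen for illustration, not the implementation reported in the paper.

```python
# Illustrative sketch only: how an ASR -> transliterator -> NER pipeline could be
# orchestrated inside a cross-media analysis framework. All component classes and
# their toy bodies are hypothetical placeholders, not the paper's implementation.

from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    text: str
    label: str  # e.g. "PERSON", "LOCATION"


class AmharicASR:
    """Stand-in for the speech recognizer; would decode audio into an Ethiopic-script transcript."""

    def transcribe(self, audio_path: str) -> str:
        # A real component would run the acoustic and language models here.
        return "ዮናስ በአዲስ አበባ ይኖራል"  # placeholder transcript ("Yonas lives in Addis Ababa")


class Transliterator:
    """Stand-in for the transliterator that bridges ASR output and NER input."""

    def romanize(self, text: str) -> str:
        # A real component would map each Ethiopic syllable (fidel) to a Latin form.
        table = {"ዮናስ": "yonas", "በአዲስ": "be'addis", "አበባ": "abeba", "ይኖራል": "yinoral"}
        return " ".join(table.get(tok, tok) for tok in text.split())


class NEROverOpenNLP:
    """Stand-in for an OpenNLP-backed name finder trained on romanized Amharic."""

    def extract(self, text: str) -> List[Entity]:
        # A real component would call an OpenNLP name-finder model; here we tag known names.
        known = {"yonas": "PERSON", "addis": "LOCATION", "abeba": "LOCATION"}
        tokens = text.replace("'", " ").split()
        return [Entity(tok, known[tok]) for tok in tokens if tok in known]


def asr_ner_pipeline(audio_path: str) -> List[Entity]:
    """ASR -> transliteration -> NER, in the order described in the abstract."""
    transcript = AmharicASR().transcribe(audio_path)
    romanized = Transliterator().romanize(transcript)
    return NEROverOpenNLP().extract(romanized)


if __name__ == "__main__":
    print(asr_ner_pipeline("lecture.wav"))
```

The point of the sketch is only the ordering of the three stages: the transliterator sits between ASR and NER so that the entity extractor can consume a script it was trained on.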

Place, publisher, year, edition, pages
2020. p. 298-305
Keywords [en]
Speech recognition, named entity recognition, Less-resourced languages, Amharic, Cross-media analysis
National Category
Computer Sciences
Research subject
Computational linguistics
Identifiers
URN: urn:nbn:se:umu:diva-170765
ISBN: 979-10-95546-35-1 (print)
OAI: oai:DiVA.org:umu-170765
DiVA, id: diva2:1430423
Conference
Language Resources and Evaluation Conference (LREC 2020), Marseille, France, May 11–16, 2020
Available from: 2020-05-15 Created: 2020-05-15 Last updated: 2024-02-01 Bibliographically approved
In thesis
1. NLP methods for improving user rating systems in crowdsourcing forums and speech recognition of less resourced languages
2024 (English) Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
NLP-metoder för att förbättra användarbetygssystem i crowdsourcing-forum och taligenkänning av språk med mindre resurser
Abstract [en]

We develop NLP and ASR methods (e.g., algorithms, architectures) for solving the following problems: biases induced by user rating, ranking, recommendation, and search engine algorithms; computational inefficiencies of conventional syntactic-semantic parsing algorithms; the extensive linguistic resource requirements imposed by traditional ASR methods; interoperability issues faced by NLP and ASR components within cross-media analysis; and the difficulty of searching audio and video content over the Web.

User rating systems (URSs) in crowdsourcing forums (CSFs), such as QA platforms, rely solely on voting schemes and fail to incorporate linguistic quality and user competence information. This failure affects the trustworthiness of answers on the Web, as search engines are likely biased towards popular (highly voted) answers. It also degrades the quality of entire QA platforms, since other components depend on the accuracy of the underlying URSs. Conventional ASR methods, on the other hand, present two major challenges: a failure of acoustic models to work within collaborative environments, since these methods only build models that operate in isolation, and a resource-related challenge.

Our thesis makes significant contributions, published at prestigious AI, NLP, and ASR venues; our 9 papers have received over 90 citations. The proposed approaches potentially transform voting-based rating into linguistic-quality-based rating; shallow, meta-data-feature-based answer quality prediction into prediction based on deep syntactic-semantic features and user competence; single-machine, sequential syntactic parsing into parallel, distributed, cloud-based parsing; and meta-data-based querying of spoken documents into full-text querying and searching, as well as sentiment- and competence-based querying of textual content. A hypothetical sketch of the first of these transformations follows.
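The snippet below is a purely hypothetical illustration of what moving from vote-only rating to quality- and competence-aware rating could look like: it blends net votes with placeholder linguistic-quality and author-competence scores. The features, weights, and blending formula are illustrative assumptions, not the models developed in the thesis.

```python
# Hypothetical illustration of replacing vote-only rating with a blended score that
# also weighs linguistic quality and author competence. Features and weights are
# invented for demonstration.

from dataclasses import dataclass


@dataclass
class Answer:
    upvotes: int
    downvotes: int
    linguistic_quality: float  # e.g. from a grammaticality/readability model, in [0, 1]
    author_competence: float   # e.g. estimated from the author's answer history, in [0, 1]


def vote_only_score(a: Answer) -> float:
    """Conventional rating: net votes only."""
    return a.upvotes - a.downvotes


def blended_score(a: Answer, w_votes: float = 0.4,
                  w_quality: float = 0.4, w_comp: float = 0.2) -> float:
    """Blend of vote ratio, linguistic quality, and author competence (illustrative weights)."""
    total = a.upvotes + a.downvotes
    vote_ratio = a.upvotes / total if total else 0.5
    return w_votes * vote_ratio + w_quality * a.linguistic_quality + w_comp * a.author_competence


if __name__ == "__main__":
    a = Answer(upvotes=12, downvotes=3, linguistic_quality=0.8, author_competence=0.6)
    print(vote_only_score(a), round(blended_score(a), 3))
```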

Theoretically, we advance the understanding of the relationship between authors' text and their proficiency in performing certain tasks through successive research works that uncover the rules governing the conjectured link between them. We also propose new bag-of-words approaches (based on latent topic modeling, syntactic categories, and dependency relations). These approaches yield significant accuracy gains over conventional TF-IDF (term frequency-inverse document frequency) based models and reduce domain dependence, as they capture structural and topical information; a toy comparison of the two kinds of representation is sketched below.
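For orientation only, the sketch below contrasts a conventional TF-IDF representation with a latent-topic bag-of-words representation of the same toy documents, in the spirit of the topic-based variant mentioned above. The corpus and the choice of scikit-learn components are assumptions for illustration, not the thesis's experimental setup.

```python
# Illustrative comparison: TF-IDF weighted bag of surface words vs. a bag of latent
# topics learned from raw term counts. Toy corpus and scikit-learn usage are assumed
# for demonstration only.

from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the parser builds a dependency tree for each sentence",
    "speech recognition converts audio into text transcripts",
    "the dependency parser labels syntactic relations between words",
]

# Baseline: TF-IDF weighted bag of surface words.
tfidf = TfidfVectorizer().fit_transform(docs)   # shape: (n_docs, n_vocab)

# Alternative: bag of latent topics learned from term counts.
counts = CountVectorizer().fit_transform(docs)
topics = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(counts)
# shape: (n_docs, n_topics); each row is a document's topic mixture, which is
# lower-dimensional and less tied to exact surface vocabulary than the TF-IDF matrix.

print(tfidf.shape, topics.shape)
```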

Place, publisher, year, edition, pages
Umeå: Umeå University, 2024. p. 55
Series
Report / UMINF, ISSN 0348-0542 ; 23.09
Keywords
NLP-algorithms, speech-recognition, transfer-learning, syntax-semantics, computational linguistic model, Amharic-NLP, cloud-NLP architecture, question-answering, crowdsourcing, user-rating, less-resourced languages
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science
Identifiers
urn:nbn:se:umu:diva-220298 (URN)
978-91-8070-261-4 (ISBN)
978-91-8070-260-7 (ISBN)
Public defence
2024-02-29, NAT.D.360, Umeå University, 13:00 (English)
Note

Incorrect bibliographic information for Paper III; see the errata sheet.

Available from: 2024-02-08 Created: 2024-01-31 Last updated: 2024-02-05 Bibliographically approved

Authority records

Woldemariam, Yonas Demeke; Dahlgren, Adam
