Predicting User Competence from Text
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Formal and Natural Language)
2017 (English). In: The 21st world multi-conference on systemics, cybernetics and informatics: proceedings: volume 1 / [ed] Nagib Callaos, Belkis Sánches, Michael Savoie, Andrés Tremante, International Institute of Informatics and Systemics, 2017, p. 147-152. Conference paper, Published paper (Refereed)
Abstract [en]

We explore the possibility of learning user competence from text by using natural language processing and machine learning (ML) methods. In our context, competence is defined as the ability to identify the wildlife appearing in images and classify it into species correctly. We evaluate and compare the performance (in terms of accuracy and F-measure) of three ML methods, Naive Bayes (NB), Decision Trees (DT) and K-nearest neighbors (KNN), applied to a text corpus obtained from Snapshot Serengeti discussion forum posts. The baseline results show that, in terms of accuracy, DT outperforms NB and KNN by 16.00% and 15.00%, respectively. In terms of F-measure, KNN outperforms NB and DT by 12.08% and 1.17%, respectively. We also propose a hybrid model that combines the three models (DT, NB and KNN). We improve the baseline results with a calibration technique and additional features. Adding a bi-gram feature yields a dramatic increase in accuracy (from 48.38% to 64.40%) for the NB model. We managed to raise the accuracy limit of the baseline models from 93.39% to 94.09%.
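
As a rough illustration of the evaluation terms used in the abstract, the following Python sketch computes accuracy and F-measure, extracts word bi-grams, and combines three models' predictions. It is not the paper's implementation: the abstract does not say how the hybrid model combines DT, NB and KNN, so majority voting is assumed here, and the class labels, example predictions, and function names are all hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive):
    """Balanced F-measure (F1) with one class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def word_ngrams(text, n=2):
    """Lowercased whitespace tokens joined into n-grams (bi-grams by default)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def majority_vote(*prediction_lists):
    """Assumed combination rule: most common label per item across models."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*prediction_lists)]


# Hypothetical labels and per-model predictions, for illustration only.
y_true = ["competent", "novice", "competent", "novice"]
nb_pred = ["competent", "competent", "novice", "novice"]
dt_pred = ["competent", "novice", "competent", "competent"]
knn_pred = ["novice", "novice", "competent", "novice"]

hybrid_pred = majority_vote(nb_pred, dt_pred, knn_pred)
print(accuracy(y_true, hybrid_pred))
print(f_measure(y_true, hybrid_pred, positive="competent"))
print(word_ngrams("I think this is a zebra"))
```

With an odd number of models and two classes, the vote never ties; other settings would need an explicit tie-breaking rule.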

Place, publisher, year, edition, pages
International Institute of Informatics and Systemics, 2017. p. 147-152
Keywords [en]
text analysis, NLP, machine-learning, naive bayes, decision trees, K-nearest neighbors
National Category
Language Technology (Computational Linguistics)
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-138291
ISBN: 978-1-941763-59-9 (print)
OAI: oai:DiVA.org:umu-138291
DiVA, id: diva2:1133833
Conference
21st World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2017), Orlando, Florida, USA, July 8-11, 2017
Available from: 2017-08-17 Created: 2017-08-17 Last updated: 2018-06-09 Bibliographically approved
Author
Woldemariam, Yonas
