When words are not enough: An evaluation of character n-grams and function words in author identification of musical artists
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2018 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
Abstract [en]

When we write, we unconsciously leave fingerprints behind: the words we use, punctuation, special characters and more. Several approaches to author identification make use of these features, and they have been applied to a wide variety of texts, from papers and poems to e-mail and forum posts. This study uses song lyrics, with the artists treated as the authors, to compare the performance of two common features: character n-grams and function words. Both are among the most prominent features in author identification and have a track record of good performance. Despite high hopes, neither feature reached the expected results. The two features were expected to achieve 70% and 65% accuracy respectively, but the average accuracy achieved was only 40% and 35%. Even so, some interesting findings emerged. For some artists, several band members write the songs, which raised concern that performance would suffer. Interestingly, the results showed that multiple authorship did not harm performance; in some cases such artists were identified more accurately than artists with a single author.
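As a rough illustration of the two feature types named in the abstract, the following sketch counts character n-grams and computes relative function-word frequencies for a piece of text. It is not taken from the thesis: the function names, the choice of n = 3, the tokenisation rule and the very short function-word list are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# this sketch); real studies typically use far longer lists.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "a", "i", "you")


def char_ngrams(text, n=3):
    """Count overlapping character n-grams, keeping spaces and punctuation."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def function_word_profile(text, words=FUNCTION_WORDS):
    """Return the relative frequency of each function word among all tokens."""
    # Simple tokenisation (an assumption): runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return {w: counts[w] / total for w in words}


if __name__ == "__main__":
    lyric = "The night is long, and the road is cold"
    print(char_ngrams(lyric, n=3).most_common(5))
    print(function_word_profile(lyric))
```

Feature vectors of this kind could then be passed to any standard classifier; which classifier the thesis used is not stated in this record.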

Place, publisher, year, edition, pages
2018.
Series
UMNAD ; 1165
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:umu:diva-156137
OAI: oai:DiVA.org:umu-156137
DiVA, id: diva2:1286107
Educational program
Bachelor of Science Programme in Computing Science
Available from: 2019-02-06 Created: 2019-02-06 Last updated: 2019-02-06 Bibliographically approved

Open Access in DiVA

fulltext (195 kB)
File information
File name: FULLTEXT01.pdf
File size: 195 kB
Checksum (SHA-512): 799987aa95b946b00d5955be6c3e315f4d5e84e925b532d5d7a1366364f0a1f3b9701253472a138cfce06a465a00a7eedde62fb0c6b9588195a458d37dbde392
Type: fulltext
Mimetype: application/pdf
