Syntactic methods for topic-independent authorship attribution
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2017 (English). In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 23, no. 5, pp. 789-806. Article in journal (Refereed), Published.
Abstract [en]

The efficacy of syntactic features for topic-independent authorship attribution is evaluated, taking a feature set of frequencies of words and punctuation marks as baseline. The syntactic features are 'deep' in the sense that they are derived by parsing the subject texts, in contrast to 'shallow' syntactic features for which a part-of-speech analysis is enough. The experiments are conducted on two corpora of online texts and one corpus of novels written around the year 1900. The classification tasks include classical closed-world authorship attribution, identification of separate texts among the works of one author, and cross-topic authorship attribution. In the first tasks, the feature sets were fairly evenly matched, but for the last task, the syntax-based feature set outperformed the baseline feature set. These results suggest that, compared to lexical features, syntactic features are more robust to changes in topic.
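
To make the baseline concrete, the following is a minimal sketch of a feature vector built from relative frequencies of words and punctuation marks. The abstract does not specify tokenisation, vocabulary selection, or normalisation, so the regular expression, the function name `baseline_features`, and the example vocabulary are all assumptions for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# Assumed tokeniser: a token is either a run of word characters
# or a single punctuation mark (any non-word, non-space character).
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def baseline_features(text, vocabulary):
    """Return the relative frequency of each vocabulary item in `text`.

    `vocabulary` is a hypothetical, fixed list of words and punctuation
    marks; how such a list is chosen is not stated in the abstract.
    """
    tokens = TOKEN_RE.findall(text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[item] / total for item in vocabulary]


vocabulary = ["the", "of", ",", "."]
features = baseline_features("The cat, the dog.", vocabulary)
```

Vectors of this kind, computed per text, could then be passed to any standard classifier. The deep syntactic features described in the abstract would additionally require a parser and are not sketched here.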

Place, publisher, year, edition, pages
Cambridge University Press, 2017. Vol. 23, no. 5, pp. 789-806.
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:umu:diva-139621
DOI: 10.1017/S1351324917000249
ISI: 000407573100006
OAI: oai:DiVA.org:umu-139621
DiVA: diva2:1146847
Available from: 2017-10-04. Created: 2017-10-04. Last updated: 2017-10-04. Bibliographically approved.
Authors: Björklund, Johanna; Zechner, Niklas