Improved N-Best Extraction with an Evaluation on Language Data
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0001-8503-0118
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0001-7349-7693
Umeå University, Faculty of Science and Technology, Department of Computing Science. ORCID iD: 0000-0002-9873-4170
2022 (English). In: Computational Linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 48, no. 1, p. 119-153. Article in journal (Refereed). Published.
Abstract [en]

We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. The modifications make the evaluation strategy lazier than that of the original algorithm, which renders the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software BETTY and compared with the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit TIBURON. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that BETTY outperforms TIBURON on all tested data sets with respect to running time, while TIBURON seems to be the more memory-efficient choice.
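
For readers unfamiliar with the setting: a wta over the tropical semiring assigns a tree the minimum, over its accepting runs, of the sum of the weights of the rules used, and the N-best trees problem asks for N distinct trees of minimal weight. The Python sketch below illustrates this problem statement on a small hypothetical automaton using naive brute-force enumeration; it is only an illustration of the problem, not the lazy algorithm of the paper and not the BETTY or TIBURON implementations.

```python
from itertools import product

# A hypothetical three-rule wta over the tropical semiring: combining
# weights along a derivation is ordinary addition, and choosing among
# alternative trees is taking the minimum.
RULES = [
    # (symbol, child states, parent state, rule weight)
    ("a", (), "q", 1.0),
    ("b", (), "q", 2.0),
    ("f", ("q", "q"), "q", 0.5),
]
FINAL_STATES = {"q"}

def n_best_trees(n, rounds=4):
    """Return the n cheapest accepted trees, found by bounded brute force."""
    pool = {}  # state -> list of (weight, tree) pairs discovered so far
    for _ in range(rounds):  # each round applies every rule once more
        for symbol, children, parent, w in RULES:
            options = [tuple(pool.get(q, ())) for q in children]
            for combo in product(*options):  # all ways to pick child trees
                weight = w + sum(cw for cw, _ in combo)  # tropical product: +
                tree = (symbol, tuple(t for _, t in combo))
                bucket = pool.setdefault(parent, [])
                if (weight, tree) not in bucket:
                    bucket.append((weight, tree))
    accepted = {e for q in FINAL_STATES for e in pool.get(q, [])}
    return sorted(accepted)[:n]  # tropical sum: keep the minima

print(n_best_trees(3))
# [(1.0, ('a', ())), (2.0, ('b', ())), (2.5, ('f', (('a', ()), ('a', ()))))]
```

The point of the paper's contribution, by contrast, is to avoid this kind of exhaustive search: a lazy evaluation strategy materializes candidate trees only when they can still end up among the N cheapest.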

Place, publisher, year, edition, pages
MIT Press, 2022. Vol. 48, no. 1, p. 119-153
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-194338
DOI: 10.1162/COLI_a_00427
ISI: 000993785100004
Scopus ID: 2-s2.0-85128188225
OAI: oai:DiVA.org:umu-194338
DiVA, id: diva2:1655914
Available from: 2022-05-04. Created: 2022-05-04. Last updated: 2023-09-05. Bibliographically approved.

Open Access in DiVA

fulltext (1867 kB), 160 downloads
File information
File name: FULLTEXT01.pdf
File size: 1867 kB
Checksum (SHA-512): 4b3207ec197e724e7702f8818f3b0263c294331c94b48c3e6aa36fea331039c2fe229537c41027ce58101d3e2b9b9c7978920dbb286f16841890a02bd681fa49
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text; Scopus

Authority records

Björklund, Johanna; Drewes, Frank; Jonsson, Anna

Total: 160 downloads.
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.
