umu.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Efficient incremental evaluation of succinct regular expressions
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Foundations of Language Processing)
Universität Bayreuth.
Universität Bayreuth.
2015 (English)In: Proceedings of the 24th ACM International on Conference on Information and Knowledge Management, Association for Computing Machinery (ACM), 2015, 1541-1550 p.Conference paper, Published paper (Refereed)
Abstract [en]

Regular expressions are omnipresent in database applications. They form the structural core of schema languages for XML, they are a fundamental ingredient for navigational queries in graph databases, and are being considered in languages for upcoming technologies such as schema- and transformation languages for tabular data on the Web. In this paper we study the usage and effectiveness of the counting operator (or: limited repetition) in regular expressions. The counting operator is a popular extension which is part of the POSIX standard and therefore also present in regular expressions in grep, Java, Python, Perl, and Ruby. In a database context, expressions with counting appear in XML Schema and languages for querying graphs such as SPARQL 1.1 and Cypher.

We first present a practical study that suggests that counters are extensively used in practice. We then investigate evaluation methods for such expressions and develop a new algorithm for efficient incremental evaluation. Finally, we conduct an extensive benchmark study that shows that exploiting counting operators can lead to speed-ups of several orders of magnitude in a wide range of settings: normal and incremental evaluation on synthetic and real expressions.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2015. 1541-1550 p.
Keyword [en]
XML, Schema, Regular Expressions
National Category
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-110695DOI: 10.1145/2806416.2806434ISBN: 978-1-4503-3794-6 (print)OAI: oai:DiVA.org:umu-110695DiVA: diva2:864248
Conference
ACM International on Conference on Information and Knowledge Management (CIKM)2015, Melbourne, VIC, Australia
Available from: 2015-10-26 Created: 2015-10-26 Last updated: 2016-01-15Bibliographically approved

Open Access in DiVA

No full text

Other links

Publisher's full text

Search in DiVA

By author/editor
Björklund, Henrik
By organisation
Department of Computing Science
Computer Science

Search outside of DiVA

GoogleGoogle Scholar

Altmetric score

Total: 86 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf