Umeå University's logo

umu.sePublications
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Shortcut Transformers and the Learnability of Automata
Umeå University, Faculty of Science and Technology, Department of Computing Science.
2023 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

Transformers have emerged as a powerful architecture for various tasks in natural language processing, computer vision, and multi-modal domains. Despite their success, understanding the computational capabilities and limitations of transformers remains a challenge. This work focuses on relating transformers to deterministic finite automata (DFAs) and empirically investigates the architecture's ability to simulate DFAs of varying complexities. We empirically explore the simulation of DFAs by transformers, specifically focusing on the solvable A4-DFA and the non-solvable A5-DFA. We conduct experiments to evaluate the in-distribution and out-of-distribution accuracy of sub-linear depth transformers with positive results on both accounts. Additionally, we examine the impact of widening the transformer to find even shallower transformers for the  A4-DFA. While no significant improvements are observed compared to the sub-linear depth transformers, further exploration of hyperparameters is needed for more reliable results.

Place, publisher, year, edition, pages
2023. , p. 53
Series
UMNAD ; 1404
Keywords [en]
transformers, modern neural networks, automata, expressivity, learnability, regular languages, solvable groups
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-210546OAI: oai:DiVA.org:umu-210546DiVA, id: diva2:1773419
Educational program
Master's Programme in Computing Science
Supervisors
Examiners
Available from: 2023-06-27 Created: 2023-06-22 Last updated: 2023-06-27Bibliographically approved

Open Access in DiVA

fulltext(1741 kB)436 downloads
File information
File name FULLTEXT01.pdfFile size 1741 kBChecksum SHA-512
d567bc60ea8977d1bfc2111f85c3d85bec2cd18fe0bd35a5a5b57e2740f99cb75c7d6c9accd3392c2ec868dcc79d237de1334f63d4fc85fd67119560ef270047
Type fulltextMimetype application/pdf

Search in DiVA

By author/editor
Martens, Willeke
By organisation
Department of Computing Science
Computer Sciences

Search outside of DiVA

GoogleGoogle Scholar
Total: 436 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 644 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf