umu.se Publications
  • 1.
    Bensch, Suna
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Drewes, Frank
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Hellström, Thomas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Grammatical Inference of Graph Transformation Rules. 2015. In: Proceedings of the 7th Workshop on Non-Classical Models of Automata and Applications (NCMA 2015), Austrian Computer Society, 2015, p. 73-90. Conference paper (Refereed)
  • 2.
    Berglund, Martin
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Björklund, Henrik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Drewes, Frank
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    On the Parameterized Complexity of Linear Context-Free Rewriting Systems. 2013. In: Proceedings of the 13th Meeting on the Mathematics of Language (MoL 13), Association for Computational Linguistics, 2013, p. 21-29. Conference paper (Other academic)
    Abstract [en]

    We study the complexity of uniform membership for Linear Context-Free Rewriting Systems, i.e., the problem where we are given a string w and a grammar G and are asked whether w ∈ L(G). In particular, we use parameterized complexity theory to investigate how the complexity depends on various parameters. While we focus primarily on rank and fan-out, derivation length is also considered.

  • 3.
    Björklund, Johanna
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Cleophas, Loek
    Stellenbosch University, Republic of South Africa.
    Karlsson, My
    Umeå University, Faculty of Science and Technology, High Performance Computing Center North (HPC2N). Codemill.
    An evaluation of structured language modeling for automatic speech recognition. 2017. In: Journal of universal computer science (Online), ISSN 0948-695X, E-ISSN 0948-6968, Vol. 23, no 11, p. 1019-1034. Article in journal (Refereed)
    Abstract [en]

    We evaluated probabilistic lexicalized tree-insertion grammars (PLTIGs) on a classification task relevant for automatic speech recognition. The baseline is a family of n-gram models tuned with Witten-Bell smoothing. The language models are trained on unannotated corpora, consisting of 10,000 to 50,000 sentences collected from the English section of Wikipedia. For the evaluation, an additional 150 random sentences were selected from the same source, and for each of these, approximately 3,200 variations were generated. Each variant sentence was obtained by replacing an arbitrary word by a similar word, chosen to be at most 2 character edits from the original. The evaluation task consisted of identifying the original sentence among the automatically constructed (and typically inferior) alternatives. In the experiments, the n-gram models outperformed the PLTIG model on the smaller data set, but as the size of data grew, the PLTIG model gave comparable results. While PLTIGs are more demanding to train, they have the advantage that they assign a parse structure to their input sentences. This is valuable for continued algorithmic processing, for example, for summarization or sentiment analysis.

  • 4.
    Björklund, Johanna
    et al.
    Umeå University.
    Cohen, Shay B.
    University of Edinburgh.
    Drewes, Frank
    Umeå University.
    Satta, Giorgio
    University of Padova.
    Bottom-Up Unranked Tree-to-Graph Transducers for Translation into Semantic Graphs. 2019. In: Proceedings of the 14th International Conference on Finite-State Methods and Natural Language Processing / [ed] Heiko Vogler, Andreas Maletti, Association for Computational Linguistics, 2019, p. 7-17, article id W19-3104. Conference paper (Refereed)
    Abstract [en]

    We propose a formal model for translating unranked syntactic trees, such as dependency trees, into semantic graphs. These tree-to-graph transducers can serve as a formal basis of transition systems for semantic parsing which recently have been shown to perform very well, yet hitherto lack formalization. Our model features "extended" rules and an arc-factored normal form, comes with an efficient translation algorithm, and can be equipped with weights in a straightforward manner.

  • 5.
    Björklund, Johanna
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Johansson Falck, Marlene
    Umeå University, Faculty of Arts, Department of language studies.
    How Spatial Relations Structure Linguistic Meaning. 2019. In: Proceedings of the 15th SweCog Conference / [ed] Holm, Linus & Erik Billing, Skövde: University of Skövde, 2019, p. 29-31. Conference paper (Refereed)
  • 6.
    Björklund, Johanna
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Zechner, Niklas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Syntactic methods for topic-independent authorship attribution. 2017. In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 23, no 5, p. 789-806. Article in journal (Refereed)
    Abstract [en]

    The efficacy of syntactic features for topic-independent authorship attribution is evaluated, taking a feature set of frequencies of words and punctuation marks as baseline. The features are 'deep' in the sense that they are derived by parsing the subject texts, in contrast to 'shallow' syntactic features for which a part-of-speech analysis is enough. The experiments are made on two corpora of online texts and one corpus of novels written around the year 1900. The classification tasks include classical closed-world authorship attribution, identification of separate texts among the works of one author, and cross-topic authorship attribution. In the first tasks, the feature sets were fairly evenly matched, but for the last task, the syntax-based feature set outperformed the baseline feature set. These results suggest that, compared to lexical features, syntactic features are more robust to changes in topic.

  • 7.
    Chiang, David
    et al.
    University of Notre Dame.
    Drewes, Frank
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gildea, Daniel
    University of Rochester.
    Lopez, Adam
    University of Edinburgh.
    Satta, Giorgio
    University of Padua.
    Weighted DAG automata for semantic graphs. 2018. In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 44, no 1, p. 119-186. Article in journal (Refereed)
    Abstract [en]

    Graphs have a variety of uses in natural language processing, particularly as representations of linguistic meaning. A deficit in this area of research is a formal framework for creating, combining, and using models involving graphs that parallels the frameworks of finite automata for strings and finite tree automata for trees. A possible starting point for such a framework is the formalism of directed acyclic graph (DAG) automata, defined by Kamimura and Slutzki and extended by Quernheim and Knight. In this article, we study the latter in depth, demonstrating several new results, including a practical recognition algorithm that can be used for inference and learning with models defined on DAG automata. We also propose an extension to graphs with unbounded node degree and show that our results carry over to the extended formalism.

  • 8.
    Deutschmann, Mats
    et al.
    Umeå University, Faculty of Arts, Department of language studies.
    Molka-Danielsen, Judith
    Molde University, Norway.
    Future Directions for Learning in Virtual Worlds. 2009. In: Learning and Teaching in the Virtual World of Second Life / [ed] Molka-Danielsen, J & M. Deutschmann, Trondheim: Tapir Academic Press, 2009, 1, p. 185-190. Chapter in book (Refereed)
    Abstract [en]

    Some may claim that this book has been a showcase of case studies without a common thread. However, the common goal that runs through each of these cases is the focus on learning and the roles of learners and educators in learning activities. Do virtual worlds assist learning and do they create new opportunities? The answer from these analyses is “Yes”, and this book demonstrates “how” to make use of the affordances of the virtual world of Second Life as it exists today. Yet many questions remain for both practitioners and researchers. To give some examples: On what principles should learners’ tasks be designed, who is doing research on education in virtual worlds, and what is the future of virtual worlds in a learning context? In this chapter we attempt to address some of these issues.

  • 9.
    Deutschmann, Mats
    et al.
    Umeå University, Faculty of Arts, Department of language studies.
    Panichi, Luisa
    Pisa University, Italy.
    Instructional Design: Teacher Practice and Learning Autonomy. 2009. In: Learning and Teaching in the Virtual World of Second Life / [ed] Judith Molka-Danielsen & Mats Deutschmann, Trondheim: Tapir Academic Press, 2009, 1, p. 24-44. Chapter in book (Refereed)
    Abstract [en]

    This chapter is based on the experiences from language proficiency courses given on Kamimo education island and addresses concerns related to teacher practice in Second Life. We examine preparatory issues, task design and the teacher’s role in fostering learner autonomy in Second Life. Although the chapter draws mainly on experiences from and reflections in the domain of language education, it has general pedagogical implications for teaching in SL.

  • 10.
    Deutschmann, Mats
    et al.
    Umeå University, Faculty of Arts, Department of language studies.
    Panichi, Luisa
    Pisa University, Italy.
    Talking into empty space?: signalling involvement in a virtual language classroom in Second Life. 2009. In: Language Awareness, ISSN 0965-8416, Vol. 18, no 3-4, p. 310-328. Article in journal (Refereed)
    Abstract [en]

    In this study, we compare the first and the last sessions from an online oral proficiency course aimed at doctoral students conducted in the virtual world Second Life. The study attempts to identify how supportive moves made by the teacher encourage learners to engage with language, and what type of linguistic behaviour in the learners leads to engagement in others. We compare overall differences in terms of floor space and turn-taking patterns, and also conduct a more in-depth discourse analysis of parts of the sessions focusing on supportive moves such as back-channelling and elicitors. There are indications that the supportive linguistic behaviour of teachers is important in increasing learner engagement. In our study we are also able to observe a change in student linguistic behaviour between the first and the last sessions with students becoming more active in signalling involvement as the course progresses. Finally, by illustrating some of the language awareness issues that arise in online environments, we hope to contribute to the understanding of the dynamics of online communication.

  • 11.
    Drewes, Frank
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Gebhardt, Kilian
    Technische Universität Dresden.
    Vogler, Heiko
    Technische Universität Dresden.
    EM-training for probabilistic aligned hypergraph bimorphisms. 2016. In: Proceedings of the SIGFSM Workshop on Statistical NLP and Weighted Automata, Association for Computational Linguistics, 2016, p. 60-69. Conference paper (Refereed)
    Abstract [en]

    We define the concept of probabilistic aligned hypergraph bimorphism. Each such bimorphism consists of a probabilistic regular tree grammar, two hypergraph algebras in which the generated trees are interpreted, and a family of alignments between the two interpretations. It generates a set of bihypergraphs each consisting of two hypergraphs and an alignment between them; for instance, discontinuous phrase structures and non-projective dependency structures are bihypergraphs. We show an EM-training algorithm which takes a corpus of bihypergraphs and an aligned hypergraph bimorphism as input and calculates a probability assignment to the rules of the regular tree grammar such that in the limit the maximum-likelihood of the corpus is approximated.

  • 12.
    Drewes, Frank
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Knight, Kevin
    University of Southern California.
    Kuhlmann, Marco
    Linköpings universitet.
    Formal Models of Graph Transformation in Natural Language Processing. 2015. Report (Other academic)
    Abstract [en]

    In natural language processing (NLP) there is an increasing interest in formal models for processing graphs rather than more restricted structures such as strings or trees. Such models of graph transformation have previously been studied and applied in various other areas of computer science, including formal language theory, term rewriting, theory and implementation of programming languages, concurrent processes, and software engineering. However, few researchers from NLP are familiar with this work, and at the same time, few researchers from the theory of graph transformation are aware of the specific desiderata, possibilities and challenges that one faces when applying the theory of graph transformation to NLP problems. The Dagstuhl Seminar 15122 “Formal Models of Graph Transformation in Natural Language Processing” brought researchers from the two areas together. It initiated an interdisciplinary exchange about existing work, open problems, and interesting applications.

  • 13.
    Eriksson, Erik J.
    et al.
    Umeå University, Faculty of Arts, Philosophy and Linguistics.
    Rodman, Robert D.
    Dept. of Computer Science, NCSU, USA.
    Hubal, Robert C.
    Technology Assisted Learning Ctr., RTI International, USA.
    Emotions in speech: juristic implications. 2007. In: Speaker Classification: Volume I, Berlin: Springer Verlag, 2007. Chapter in book (Other academic)
    Abstract [en]

    This chapter focuses on the detection of emotion in speech and the impact that using technology to automate emotion detection would have within the legal system. The current states of the art for studies of perception and acoustics are described, and a number of implications for legal contexts are provided. We discuss, inter alia, assessment of emotion in others, witness credibility, forensic investigation, and training of law enforcement officers.

  • 14.
    Granberg, Johan
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Minock, Michael
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    A natural language interface over the MusicBrainz database. 2011. In: Proceedings of the 1st workshop on Question Answering over Linked Data (QALD-1) / [ed] Christina Unger, Philipp Cimiano, Vanessa Lopez, Enrico Motta, 2011, p. 38-43. Conference paper (Refereed)
    Abstract [en]

    This paper demonstrates a way to build a natural language interface (NLI) over semantically rich data. Specifically, we show this over the MusicBrainz domain, inspired by the second shared task of the QALD-1 workshop. Our approach uses the tool C-Phrase [4] to build an NLI over a set of views defined over the original MusicBrainz relational database. C-Phrase uses a limited variant of X-Bar theory [3] for syntax and tuple calculus for semantics. The C-Phrase authoring tool works over any domain, and only the end configuration has to be redone for each new database covered, a task that does not require deep knowledge about linguistics and system internals. Working over the MusicBrainz domain was a challenge due to the size of the database: quite a lot of effort went into optimizing computation times and memory usage to manageable levels. This paper reports on this work and anticipates a live demonstration for querying by the public.

  • 15.
    Hansson, Britt
    Umeå University, Faculty of Social Sciences, Education.
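    Större chans att klara det?: En specialpedagogisk studie av 10 ungdomars syn på hur datorstöd har påverkat deras språk, lärande och skolsituation [A greater chance of making it?: A special-education study of 10 young people's views on how computer support has affected their language, learning and school situation]. 2008. Independent thesis, Basic level (professional degree), 10 credits / 15 HE credits. Student thesis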
    Abstract [translated from sv]

    In this study, 10 young people were interviewed about their experiences of using a computer with speech synthesis and recorded books. They were asked in which situations the tools had been useful, or had been experienced as a hindrance, in their learning and school situation. Because of major difficulties at school, the young people were lent a laptop by the school, which they used both at home and at school. Together with parents and teachers, they received guidance at the municipality's Skoldatatek. The study takes as its starting point, from a sociocultural perspective, that language develops when it is used. Schools are to offer an up-to-date education, and pupils with school difficulties are entitled to support. How this support should be designed can create a dilemma for the individual school. Support aimed directly at the individual can be taken to mean that school difficulties are seen as a problem residing in the pupil, which must not occur in "a school for all". Given this dilemma, it was important to investigate the young people's experiences of support, development and obstacles, in order to understand whether these lead to singling out and exclusion. The results showed that the young people felt more motivated with their computer tools, which compensated for their difficulties and appealed to their different learning styles. The young people said that they had become more confident writers and readers thanks to increased language use. Their accounts also make clear the necessity of support from teachers and parents. The results indicate that alternative tools in learning could contribute to greater goal attainment in a school for all, with pedagogical diversity.

  • 16.
    Hendrick, Stephanie
    Umeå University, Faculty of Arts, Humlab. Umeå University, Faculty of Arts, Modern Languages. Engelska.
    Following Conversational Traces: Part 1: Creating a corpus with the ICWSM dataset. 2007. Conference paper (Refereed)
    Abstract [en]

    This poster will present the methodology behind the creation of a linguistic corpus based on a subset of the 2007 International Conference on Weblogs and Social Media dataset. Posts from a small group of political bloggers were tagged for parts of speech and indexed into a corpus using the program Xairia. From this corpus, the political blogger subset will be investigated for register and referential information. Referential information, especially with regard to new and given information, will be compared against network placement both to identify network innovators as well as to compare network placement as a catalyst for innovation. The final section, Further Research, will outline the modifications necessary for the creation of a full-scale corpus based on the entire ICWSM 2006 dataset.

  • 17.
    Jarlbrink, Johan
    et al.
    Umeå University, Faculty of Arts, Department of culture and media studies.
    Snickars, Pelle
    Umeå University, Faculty of Arts, Department of culture and media studies.
    Cultural heritage as digital noise: nineteenth century newspapers in the digital archive. 2017. In: Journal of Documentation, ISSN 0022-0418, E-ISSN 1758-7379, Vol. 73, no 6, p. 1228-1243. Article in journal (Refereed)
    Abstract [en]

    Purpose

    The purpose of this paper is to explore and analyze the digitized newspaper collection at the National Library of Sweden, focusing on cultural heritage as digital noise. In what specific ways are newspapers transformed in the digitization process? If the digitized document is not the same as the source document – is it still a historical record, or is it transformed into something else?

    Design/methodology/approach

    The authors have analyzed the XML files from Aftonbladet 1830 to 1862. The most frequent newspaper words not matching a high-quality reference corpus were selected to zoom in on the noisiest part of the paper. The variety of the interpretations generated by optical character recognition (OCR) was examined, as well as texts generated by auto-segmentation. The authors have made a limited ethnographic study of the digitization process.

    Findings

    The research shows that the digital collection of Aftonbladet contains extreme amounts of noise: millions of misinterpreted words generated by OCR, and millions of texts re-edited by the auto-segmentation tool. How the tools work is mostly unknown to the staff involved in the digitization process. Sticking to any idea of a provenance chain is hence impossible, since many steps have been outsourced to unknown factors affecting the source document.

    Originality/value

    The detailed examination of digitally transformed newspapers is valuable to scholars depending on newspaper databases in their research. The paper also highlights the fact that libraries outsourcing digitization processes run the risk of losing control over the quality of their collections.

  • 18.
    Lindgren, Eva
    et al.
    Umeå University, Faculty of Arts, Department of language studies.
    Sullivan, Kirk
    Umeå University, Faculty of Arts, Department of language studies.
    Zhao, Huahui
    Umeå University, Faculty of Arts, Department of language studies.
    Deutschmann, Mats
    Umeå University, Faculty of Arts, Department of language studies.
    Steinvall, Anders
    Umeå University, Faculty of Arts, Department of language studies.
    Developing Peer-to-Peer Supported Reflection as a Life-Long Learning Skill: an Example from the Translation Classroom. 2011. In: Human Development and Global Advancements through Information Communication Technologies: New Initiatives / [ed] Susheel Chhabra & Hakikur Rahman, Hershey USA: IGI publishing, 2011, 1, p. 188-210. Chapter in book (Refereed)
    Abstract [en]

    Life-long learning skills have moved from being a side effect of a formal education to skills that are explicitly trained during a university degree. In a case study, a university class undertook a translation from Swedish to English in a keystroke logging environment and then replayed their translations in pairs while discussing their thought processes when undertaking the translations, and why they made particular choices and changes to their translations. Computer keystroke logging coupled with peer-based intervention assisted the students in discussing how they worked with their translations, and enabled them to see how their ideas relating to the translation developed as they worked with the text, to develop reflection skills and to learn from their peers. The process showed that computer keystroke logging coupled with peer-based intervention has the potential to (1) support student reflection and discussion around their translation tasks, (2) enhance student motivation and enthusiasm for translation and (3) develop peer-to-peer supported reflection as a life-long learning skill.

  • 19.
    Lindgren, Simon
    Umeå University, Faculty of Social Sciences, Department of Sociology.
    Introducing Connected Concept Analysis: A network approach to big text datasets. 2016. In: Text & Talk, ISSN 1860-7330, E-ISSN 1860-7349, Vol. 36, no 3, p. 341-362. Article in journal (Refereed)
    Abstract [en]

    This paper introduces Connected Concept Analysis (CCA) as a framework for text analysis which ties qualitative and quantitative considerations together in one unified model. Even though CCA can be used to map and analyze any full text dataset, of any size, the method was created specifically for taking the sensibilities of qualitative discourse analysis into the age of the Internet and big data. Using open data from a large online survey on habits and views relating to intellectual property rights, piracy and file sharing, I introduce CCA as a mixed-method approach aiming to bring out knowledge about corpuses of text, the sizes of which make it unfeasible to make comprehensive close readings. CCA aims to do this without reducing the text to numbers, as often becomes the case in content analysis. Instead of simply counting words or phrases, I draw on constant comparative coding for building concepts and on network analysis for connecting them. The result - a network graph visualization of key connected concepts in the analyzed text dataset - meets the need for text visualization systems that can support discourse analysis.

  • 20.
    Minock, Michael
    Umeå University, Faculty of Science and Technology, Department of Computing Science. KTH Royal Institute of Technology, Stockholm, Sweden.
    COVER: Covering the Semantically Tractable Question. 2017. In: Proceedings of the Software Demonstrations of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2017: Demonstration session, 2017, p. 1-4. Conference paper (Refereed)
    Abstract [en]

    In semantic parsing, natural language questions map to meaning representation language (MRL) expressions over some fixed vocabulary of predicates. To do this reliably, one must guarantee that for a wide class of natural language questions (the so called semantically tractable questions), correct interpretations are always in the mapped set of possibilities. Here we demonstrate the system COVER which significantly clarifies, revises and extends the notion of semantic tractability. COVER is written in Python and uses NLTK.

  • 21.
    Minock, Michael
    Umeå University, Faculty of Science and Technology, Department of Computing Science. KTH Royal Institute of Technology, Stockholm, Sweden.
    Evaluating an Automata Approach to Query Containment. 2017. In: Proceedings of the 13th International Conference on Finite State Methods and Natural Language Processing (FSMNLP) / [ed] Frank Drewes, 2017, p. 75-79. Conference paper (Refereed)
    Abstract [en]

    Given two queries Qsuper and Qsub, query containment is the problem of determining if Qsub(D) ⊆ Qsuper(D) for all databases D. This problem has long been explored, but to our knowledge no one has empirically evaluated a straightforward application of finite state automata to the problem. We do so here, covering the case of conjunctive queries with limited set conditions. We evaluate an implementation of our approach against straightforward implementations of both the canonical database and theorem proving approaches. Our implementation outperforms theorem proving on a natural language interface corpus over a photo/video domain. It also outperforms the canonical database implementation on single relation queries with large set conditions.

  • 22.
    Minock, Michael
    KTH Royal Institute of Technology, Stockholm, Sweden.
    In Pursuit of Decidable 'Logical Form'. 2014. Conference paper (Refereed)
  • 23.
    Minock, Michael
    KTH Royal Institute of Technology.
    Using HOL Light to Reason over Second-Order MRLs. 2016. Conference paper (Refereed)
  • 24.
    Minock, Michael
    et al.
    KTH.
    Mollevik, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Prediction and scheduling in navigation systems. 2013. In: Proceedings of the GeoHCI Workshop: in conjunction with ACM CHI 2013, 2013, p. 30-32. Conference paper (Refereed)
    Abstract [en]

    This position paper makes a case for the need to predict pedestrian position and schedule communication acts in mobile navigation systems. In our work, carried out in the context of a voice-guided city navigation system, we have found that improperly timed route instructions are a major cause of failure in guiding pedestrians in unknown environments. Furthermore, the need to communicate other information while guiding users on routes, as well as complications caused by network latencies, occurs often enough to require that we be able to synchronize communication acts with user position as they follow a route. This has led us to focus our efforts on scheduling utterances to maximize route-following success.

    In this position paper we motivate this problem and present our initial approach and findings which should be of interest to others engaged in similar efforts in both the Geography and HCI communities.

  • 25.
    Minock, Michael
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Mollevik, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Åsander, Mattias
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Does TTS-based pedestrian navigation work? 2014. Report (Other academic)
    Abstract [en]

    We seek to test the hypothesis that text-to-speech (TTS) navigation systems can adequately guide pedestrians to unknown destinations in an unfamiliar city. Such systems bypass screen-based, multi-modal techniques and simply speak route-following instructions incrementally into the pedestrian’s earpiece. Due to errors in GPS positioning, uncertainty of user heading, poor map quality and potential communication and processing latencies, this becomes a surprisingly challenging task. In our study, subjects are led on an unknown tour on the grounds of Umeå University. We evaluated both a human wizard controller as well as a simple decision-tree based controller and compared them to an ideal subject that knows the route. Results give support to our hypothesis that TTS-based navigation systems can adequately guide pedestrians. That said, our experiences point toward immediate and future improvements to make such systems more effective and agreeable. All the software and data behind this work will be open sourced to encourage confirmation, replication and, ultimately, improvement upon our results. This will soon be available for public download at http://janus-system.eu.

  • 26.
    Minock, Michael
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Mollevik, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Åsander, Mattias
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Toward an active database platform for guiding urban pedestrians. 2012. Report (Other academic)
    Abstract [en]

    We present an Android-based platform for incrementally presenting spoken route directions to guide pedestrians to destinations. Our approach makes heavy use of stored procedures and triggers in an underlying PostGIS spatial database. In fact most of the 'intelligence' of our prototype resides in database stored procedures and tables. As such it represents an example of a challenging real world case study for the use of persistent stored modules (PSM) in a complex mobility application. It also provides a platform to study performance tradeoffs for complex event processing over spatial data streams.

  • 27.
    Molka-Danielsen, Judith
    et al.
    Molde University, Norway.
    Deutschmann, Mats
    Umeå University, Faculty of Arts, Department of language studies.
    Learning and Teaching in the Virtual World of Second Life2009 (ed. 1)Book (Refereed)
    Abstract [en]

    The book disseminates the experiences and lessons learned in various educational projects in Second Life.

  • 28.
    Mollevik, Johan
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Natural language interfaces over spatial data: investigations in scalability, extensibility and reliability2013Licentiate thesis, comprehensive summary (Other academic)
  • 29.
    Panichi, Luisa
    et al.
    Pisa University, Italy.
    Deutschmann, Mats
    Umeå University, Faculty of Arts, Department of language studies.
    Molka-Danielsen, Judith
    Molde University, Norway.
    Virtual Worlds for Language Learning and Intercultural Exchange: Is it for real?2010In: Telecollaboration 2.0 / [ed] Sarah Guth & Francesca Helm, Bern: Peter Lang , 2010, 1, p. 165-198Chapter in book (Refereed)
    Abstract [en]

    Current debate in education suggests the need to promote learning contexts where learners can become increasingly active in the co-construction of knowledge within their learning community. Ironically, Second Life’s potential to simulate real-life face-to-face learning brings to the forefront traditional pedagogic concerns such as teacher/learner roles, methodology, syllabus and materials design and the validity of assessment procedures.

    The challenge for educators is thus to design tasks and promote teacher-learner interaction that encourage learner engagement, participation, autonomy and creativity within the practical constraints of Second Life (SL) and the administrative demands of the institutions we operate in.

    Based on research and experiences taken primarily from the domain of language learning, we argue that there are several issues related to task design and teacher practice that educators need to be aware of when designing and coordinating learning activities in SL.

  • 30.
    Partonia, Saeed
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Automatic determination of major vocal types using support vector machines2013In: Proceedings of Umeå's 16th student conference in computing science: USCCS 2013 / [ed] Suna Bensch & Frank Drewes, Umeå: Umeå universitet , 2013, p. 39-60Conference paper (Other academic)
    Abstract [en]

    This paper discusses the classification of basic vocal types for northern Swedish voices. The classification is carried out by use of support vector machines. The paper aims to identify the important features of voice signal for classification. The paper also presents the results of applying the selected features in classification performance. Finally, those features that outperform others in classification are introduced.

  • 31.
    Tran, Son N.
    et al.
    The Australian E-Health Research Centre, CSIRO, Brisbane, QLD 4026, Australia.
    Zhang, Qing
    Nguyen, Anthony
    Vu, Xuan-Son
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Ngo, Son
    Improving Recurrent Neural Networks with Predictive Propagation for Sequence Labelling2018In: Neural Information Processing: 25th International Conference, ICONIP 2018, Siem Reap, Cambodia, December 13-16, 2018, Proceedings, Part I, Springer, 2018, p. 452-462Conference paper (Refereed)
    Abstract [en]

    Recurrent neural networks (RNNs) are a useful tool for sequence labelling tasks in natural language processing. Although in practice RNNs suffer from the problem of vanishing/exploding gradients, their compactness still offers efficiency and makes them less prone to overfitting. In this paper we show that by propagating the prediction of previous labels we can improve the performance of RNNs while keeping the number of parameters in RNNs unchanged and adding only one more step for inference. As a result, the models are still more compact and efficient than other models with complex memory gates. In the experiments, we evaluate the idea on optical character recognition and chunking, where it achieves promising results.

  • 32.
    Vu, Thanh
    et al.
    Newcastle University.
    Nguyen, Dat Quoc
    The University of Melbourne.
    Vu, Xuan-Son
    Umeå University.
    Nguyen, Dai Quoc
    Deakin University.
    Catt, Michael
    Newcastle University.
    Trenell, Michael
    Newcastle University.
    NIHRIO at SemEval-2018 Task 3: A Simple and Accurate Neural Network Model for Irony Detection in Twitter2018In: Proceedings of The 12th International Workshop on Semantic Evaluation, New Orleans, Louisiana, USA: The Association for Computational Linguistics , 2018Conference paper (Refereed)
    Abstract [en]

    This paper describes our NIHRIO system for SemEval-2018 Task 3 "Irony detection in English tweets". We propose to use a simple neural network architecture of Multilayer Perceptron with various types of input features, including lexical, syntactic, semantic and polarity features. Our system achieves very high performance in both subtasks of binary and multi-class irony detection in tweets. In particular, we rank fifth in terms of the accuracy metric and the F1 metric. Our code is available at: https://github.com/NIHRIO/IronyDetectionInTwitter

  • 33.
    Vu, Xuan-Son
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Addi, Ait-Mlouk
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Lili, Jiang
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Graph-based Interactive Data Federation System for Heterogeneous Data Retrieval and Analytics2019In: Proceedings of The World Wide Web Conference WWW 2019, New York, NY, USA: ACM Digital Library, 2019, p. 3595-3599Conference paper (Refereed)
    Abstract [en]

    Given the increasing amount of heterogeneous data stored in relational databases, file systems or cloud environments, such data needs to be easily accessed and semantically connected for further data analytics. The potential of data federation is largely untapped. This paper presents an interactive data federation system (https://vimeo.com/319473546) that applies large-scale techniques including heterogeneous data federation, natural language processing, association rules and semantic web to perform data retrieval and analytics on social network data. The system first creates a Virtual Database (VDB) to virtually integrate data from multiple data sources. Next, an RDF generator is built to unify data, together with SPARQL queries, to support semantic data search over the text data processed by natural language processing (NLP). Association rule analysis is used to discover patterns and recognize the most important co-occurrences of variables from multiple data sources. The system demonstrates how it facilitates interactive data analytics for different application scenarios (e.g., sentiment analysis, privacy-concern analysis, community detection).

  • 34.
    Vu, Xuan-Son
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Jiang, Lili
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Brändström, Anders
    Umeå University, Faculty of Social Sciences, Centre for Demographic and Ageing Research (CEDAR).
    Elmroth, Erik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Personality-based Knowledge Extraction for Privacy-preserving Data Analysis2017In: K-CAP 2017 - Proceedings of the Knowledge Capture Conference, Austin, TX, USA: ACM Digital Library, 2017, article id 45Conference paper (Refereed)
    Abstract [en]

    In this paper, we present a differential-privacy-preserving approach that extracts personality-based knowledge to support privacy-guaranteed data analysis on sensitive personal data. Based on this approach, we further implement an end-to-end privacy-guarantee system, KaPPA, to provide researchers with iterative data analysis on sensitive data. The key challenge for differential privacy is determining a reasonable privacy budget to balance privacy preservation and data utility. Most previous work applies a unified privacy budget to all individual data, which leads to insufficient privacy protection for some individuals while over-protecting others. In KaPPA, the proposed personality-based privacy-preserving approach automatically calculates a privacy budget for each individual. Our experimental evaluations show a significant trade-off between sufficient privacy protection and data utility.

  • 35.
    Vu, Xuan-Son
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Lili, Jiang
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Self-adaptive Privacy Concern Detection for User-generated Content2018In: Proceedings of the 19th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), 2018, Cornell University Library, arXiv.org , 2018Conference paper (Other academic)
    Abstract [en]

    To protect user privacy in data analysis, a state-of-the-art strategy is differential privacy, in which scientific noise is injected into the real analysis output. The noise masks individuals’ sensitive information contained in the dataset. However, determining the amount of noise is a key challenge, since too much noise will destroy data utility while too little noise will increase privacy risk. Although previous research has designed some mechanisms to protect data privacy in different scenarios, most existing studies assume uniform privacy concerns for all individuals. Consequently, adding an equal amount of noise for all individuals leads to insufficient privacy protection for some users, while over-protecting others. To address this issue, we propose a self-adaptive approach for privacy concern detection based on user personality. Our experimental studies demonstrate the effectiveness of the approach in providing suitable personalized privacy protection for cold-start users (i.e., users without privacy-concern information in the training data).
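
    As a minimal sketch of the noise-injection idea described above, the snippet below uses the standard Laplace mechanism with a separate privacy budget (epsilon) per individual. This is an assumption for illustration only: the abstract does not name the noise mechanism or say how budgets are derived from personality, and the names `laplace_noise`, `private_value` and the example budgets are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    # random.expovariate takes the rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_value(true_value, epsilon, sensitivity=1.0, rng=None):
    """Return true_value plus Laplace noise of scale sensitivity / epsilon.

    A smaller epsilon (stricter privacy budget) yields more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical per-user budgets: a more privacy-concerned user gets a
    # smaller epsilon and therefore noisier released values.
    budgets = {"concerned_user": 0.1, "relaxed_user": 5.0}
    rng = random.Random(42)
    for user, eps in budgets.items():
        print(user, private_value(10.0, eps, rng=rng))
```

    With a uniform budget every user would receive the same noise scale; letting `epsilon` vary per user is the simplest way to express the personalized protection the abstract argues for.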

  • 36.
    Vu, Xuan-Son
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Santra, Abhishek
    Chakravarthy, Sharma
    Lili, Jiang
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Generic Multilayer Network Data Analysis with the Fusion of Content and Structure2019In: Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), 2019, Cornell University Library, arXiv.org , 2019Conference paper (Refereed)
    Abstract [en]

    Multi-feature data analysis (e.g., on Facebook, LinkedIn) is challenging, especially if one wants to do it efficiently and retain the flexibility of choosing features of interest for analysis. Features (e.g., age, gender, relationship, political view) can be given explicitly in datasets, but can also be derived from content (e.g., political view based on Facebook posts). Analysis from multiple perspectives is needed to understand the datasets (or subsets of them) and to infer meaningful knowledge. For example, the influence of age, location, and marital status on political views may need to be inferred separately (or in combination). In this paper, we adapt multilayer network (MLN) analysis, a nontraditional approach, to model the Facebook datasets, integrate content analysis, and conduct analysis driven by a list of desired application-based queries. Our experimental analysis shows the flexibility and efficiency of the proposed approach when modeling and analyzing datasets with multiple features.

  • 37.
    Vu, Xuan-Son
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Tran, Son N.
    Jiang, Lili
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    dpUGC: Learn Differentially Private Representation for User Generated Contents2019In: Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), 2019, Cornell University Library, arXiv.org , 2019Conference paper (Refereed)
    Abstract [en]

    This paper first proposes a simple yet efficient generalized approach to apply differential privacy to text representation (i.e., word embedding). Based on it, we propose a user-level approach to learn a personalized differentially private word embedding model on user generated contents (UGC). To the best of our knowledge, this is the first work on learning a user-level differentially private word embedding model from text for sharing. The proposed approaches protect the privacy of the individual from re-identification and, in particular, provide a better trade-off between privacy and data utility on UGC data for sharing. The experimental results show that the trained embedding models are applicable to classic text analysis tasks (e.g., regression). Moreover, the proposed approaches for learning differentially private embedding models are both framework- and data-independent, which facilitates deployment and sharing. The source code is available at https://github.com/sonvx/dpText.

  • 38.
    Woldemariam, Yonas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Predicting User Competence from Text2017In: The 21st world multi-conference on systemics, cybernetics and informatics: proceedings : volume 1 / [ed] Nagib Callaos, Belkis Sánches, Michael Savoie, Andrés Tremante, International Institute of Informatics and Systemics, 2017, p. 147-152Conference paper (Refereed)
    Abstract [en]

    We explore the possibility of learning user competence from text by using natural language processing and machine learning (ML) methods. In our context, competence is defined as the ability to identify the wildlife appearing in images and classify it into species correctly. We evaluate and compare the performance (regarding accuracy and F-measure) of three ML methods, Naive Bayes (NB), Decision Trees (DT) and K-nearest neighbors (KNN), applied to a text corpus obtained from Snapshot Serengeti discussion forum posts. The baseline results show that, regarding accuracy, DT outperforms NB and KNN by 16.00% and 15.00%, respectively. Regarding F-measure, KNN outperforms NB and DT by 12.08% and 1.17%, respectively. We also propose a hybrid model that combines the three models (DT, NB and KNN). We improve the baseline results with the calibration technique and additional features. Adding a bi-gram feature yields a dramatic increase in accuracy (from 48.38% to 64.40%) for the NB model. We managed to push the accuracy limit of the baseline models from 93.39% to 94.09%.

  • 39.
    Woldemariam, Yonas Demeke
    et al.
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Bensch, Suna
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Björklund, Henrik
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Predicting User Competence from Linguistic Data2017In: Proceedings of the 14th International Conference on Natural Language Processing (ICON-2017) / [ed] Sivaji Bandyopadhyay, Jadavpur University , 2017, p. 476-484Conference paper (Refereed)
    Abstract [en]

    We investigate the problem of predicting the competence of users of the crowd-sourcing platform Zooniverse by analyzing their chat texts. Zooniverse is an online platform where objects of different types are displayed to volunteer users to classify. Our research focuses on the Zooniverse Galaxy Zoo project, where users classify the images of galaxies and discuss their classifications in text. We apply natural language processing methods to extract linguistic features including syntactic categories, bag-of-words, and punctuation marks. We trained three supervised machine-learning classifiers on the resulting dataset: k-nearest neighbors, decision trees (with gradient boosting) and naive Bayes. They are evaluated (regarding accuracy and F-measure) with two different but related domain datasets. The performance of the classifiers varies across the feature set configurations designed during the training phase. A challenging part of this research is to compute the competence of the users without ground truth data available. We implemented a tool that estimates the proficiency of users and annotates their text with computed competence. Our evaluation results show that the trained classifier models give results that are significantly better than chance and can be deployed for other crowd-sourcing projects as well.

  • 40.
    Yan, Xiaoyong
    et al.
    Yang, Seong.Gyu
    Kim, Beom Jun
    Minnhagen, Petter
    Umeå University, Faculty of Science and Technology, Department of Physics.
    Benford's Law and the First Letter of Words2018In: Physica A: Statistical Mechanics and its Applications, ISSN 0378-4371, E-ISSN 1873-2119, Vol. 512, p. 305-315Article in journal (Other academic)
    Abstract [en]

    A universal First-Letter Law (FLL) is derived and described. It predicts the percentages of first letters for words in novels. The FLL is akin to Benford’s law (BL) of first digits, which predicts the percentages of first digits in a data collection of numbers. Both are universal in the sense that FLL only depends on the number of letters in the alphabet, whereas BL only depends on the number of digits in the base of the number system. The existence of these types of universal laws appears counter-intuitive. Nonetheless, both describe data very well. Relations to some earlier works are given. FLL predicts that an English author on average starts about 16 out of 100 words with the English letter ‘t’. This is corroborated by data, yet an author can freely write anything. Fuller implications and the applicability of FLL remain for the future.
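
    To make the comparison with Benford's law concrete, the sketch below computes the well-known first-digit probabilities P(d) = log_b(1 + 1/d) for a number system with base b. It illustrates only BL, which depends on nothing but the base; the FLL formula is not given in the abstract and is not reproduced here. The function name `benford_probability` is hypothetical.

```python
import math


def benford_probability(digit, base=10):
    """Probability that a number's leading digit equals `digit` under Benford's law."""
    if not 1 <= digit < base:
        raise ValueError("digit must satisfy 1 <= digit < base")
    return math.log(1 + 1 / digit, base)


if __name__ == "__main__":
    # Print the predicted percentage for each possible leading digit in base 10.
    for d in range(1, 10):
        print(d, round(100 * benford_probability(d), 1))
```

    The probabilities telescope to log_b(b) = 1 when summed over all leading digits, which is why the law needs no parameter other than the base.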

  • 41.
    Zechner, Niklas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    A novel approach to text classification2017Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    This thesis explores the foundations of text classification, using both empirical and deductive methods, with a focus on author identification and syntactic methods. We strive for a thorough theoretical understanding of what affects the effectiveness of classification in general. 

    To begin with, we systematically investigate the effects of some parameters on the accuracy of author identification. How is the accuracy affected by the number of candidate authors, and the amount of data per candidate? Are there differences in how methods react to the changes in parameters? Using the same techniques, we see indications that methods previously thought to be topic-independent might not be so, but that syntactic methods may be the best option for avoiding topic dependence. This means that previous studies may have overestimated the power of lexical methods. We also briefly look for ways of spotting which particular features might be the most effective for classification. Apart from author identification, we apply similar methods to identifying properties of the author, including age and gender, and attempt to estimate the number of distinct authors in a text sample. In all cases, the techniques are proven viable if not overwhelmingly accurate, and we see that lexical and syntactic methods give very similar results. 

    In the final parts, we see some results of automata theory that can be of use for syntactic analysis and classification. First, we generalise a known algorithm for finding a list of the best-ranked strings according to a weighted automaton, to doing the same with trees and a tree automaton. This result can be of use for speeding up parsing, which often runs in several steps, where each step needs several trees from the previous as input. Second, we use a compressed version of deterministic finite automata, known as failure automata, and prove that finding the optimal compression is NP-complete, but that there are efficient algorithms for finding good approximations. Third, we find and prove the derivatives of regular expressions with cuts. Derivatives are an operation on expressions to calculate the remaining expression after reading a given symbol, and cuts are an extension to regular expressions found in many programming languages. Together, these findings may be able to improve on the syntactic analysis which we have seen is a valuable tool for text classification.

  • 42.
    Zechner, Niklas
    Umeå University, Faculty of Science and Technology, Department of Computing Science.
    Derivatives of regular expressions with cuts2017Report (Other academic)
    Abstract [en]

    Derivatives of regular expressions are an operation which for a given expression produces an expression for what remains after a specific symbol has been read. This can be used as a step in transforming an expression into a finite string automaton. Cuts are an extension of the ordinary regular expressions; the cut operator is essentially a concatenation without backtracking, formalising a behaviour found in many programming languages. Just as for concatenation, we can also define an iterated cut operator. We show and derive expressions for the derivatives of regular expressions with cuts and iterated cuts.
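
    The derivative operation described above can be sketched for ordinary regular expressions as follows. This is the classical Brzozowski construction, shown under the assumption that expressions are encoded as nested tuples; it does not cover the cut or iterated cut operators, whose derivative rules are the report's contribution and are not stated in the abstract. All names (`sym`, `cat`, `alt`, `star`, `nullable`, `derivative`, `matches`) are hypothetical.

```python
EMPTY = ("empty",)  # matches no string
EPS = ("eps",)      # matches only the empty string


def sym(c):
    return ("sym", c)


def cat(r, s):
    return ("cat", r, s)


def alt(r, s):
    return ("alt", r, s)


def star(r):
    return ("star", r)


def nullable(r):
    """True if the expression accepts the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt


def derivative(r, a):
    """Expression for what remains of r after the symbol a has been read."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == a else EMPTY
    if tag == "alt":
        return alt(derivative(r[1], a), derivative(r[2], a))
    if tag == "star":
        return cat(derivative(r[1], a), r)
    # Concatenation: a is read by the left part, or by the right part
    # when the left part can match the empty string.
    left = cat(derivative(r[1], a), r[2])
    return alt(left, derivative(r[2], a)) if nullable(r[1]) else left


def matches(r, text):
    """Match by taking one derivative per input symbol."""
    for ch in text:
        r = derivative(r, ch)
    return nullable(r)


if __name__ == "__main__":
    ab_star = star(cat(sym("a"), sym("b")))
    print(matches(ab_star, "abab"), matches(ab_star, "aba"))
```

    Each distinct derivative reachable from the original expression corresponds to a state, which is how derivatives serve as a step toward building a finite string automaton; a practical version would also simplify expressions to keep that set finite and small.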
