Textual information retrieval: An approach based on language modeling and neural networks
Umeå University, Faculty of Science and Technology, Applied Physics and Electronics.
2004 (English)
Doctoral thesis, monograph (Other academic)
Abstract [en]

This thesis covers topics relevant to information organization and retrieval. The main objective of the work is to provide algorithms that improve the recall-precision performance of retrieval tasks in a wide range of applications: document organization and retrieval, web-document pre-fetching, and clustering of documents based on novel encoding techniques.

The first part of the thesis deals with document organization and retrieval using unsupervised neural networks, namely the self-organizing map, and statistical encoding methods for representing the available documents as numerical vectors. The objective of this part is to introduce a set of novel variants of the self-organizing map algorithm that address certain shortcomings of the original algorithm.

In the second part of the thesis, the latencies perceived by users surfing the Internet are shortened by means of a novel transparent and speculative pre-fetching algorithm. The proposed algorithm relies on a model of the user's browsing behaviour and predicts the user's future actions. In modeling the user's behaviour, the algorithm relies on the contextual statistics of the web pages visited by the user.

Finally, the last chapter of the thesis provides preliminary theoretical results along with a general framework for current and future scientific work. The chapter describes the use of the Zipf distribution for document organization and the use of the AdaBoost algorithm for improving the performance of pre-fetching algorithms.

Place, publisher, year, edition, pages
Umeå: Tillämpad fysik och elektronik, 2004. 176 p.
Keyword [en]
Informatics, computer and systems science, Language modeling
Keyword [sv]
Informatik, data- och systemvetenskap
National Category
Computer and Information Science
URN: urn:nbn:se:umu:diva-252
ISBN: 91-7305-623-5
OAI: diva2:142818
Public defence
2004-04-15, Umeå, 13:00
Available from: 2004-04-29 Created: 2004-04-29 Bibliographically approved

