Random databases with approximate record matching
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. Moscow MV Lomonosov State Univ, Fac Math & Mech, Moscow 119992, Russia. (Stochastic processes)
Kiel University, Institute of Informatics. (Databases)
2010 (English). In: Methodology and Computing in Applied Probability, ISSN 1387-5841, E-ISSN 1573-7713, Vol. 12, no. 1, p. 63-89. Article in journal (Refereed), Published
Abstract [en]

In many database applications in telecommunication, environmental and health sciences, bioinformatics, physics, and econometrics, real-world data are uncertain and subject to errors. These data are processed, transmitted, and stored in large databases. We consider stochastic modelling for databases with uncertain data and for some basic database operations (for example, join and selection) with exact and approximate matching. Approximate join is used for merging or data deduplication in large databases. The distribution and mean of the join sizes are studied for random databases. A random database is treated as a table of independent random records with a common distribution (or as a set of random tables). These results can be used for integration of information from different databases, multiple join optimization, and various probabilistic algorithms for structured random data.
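
As a rough illustration of the random-database model described in the abstract (not the authors' own method or notation), the sketch below treats each table as a list of independent join keys drawn from a common discrete distribution. Under that assumption the mean size of an exact join of tables with n and m records is n * m * sum(p_i^2), where sum(p_i^2) is the collision probability linked to the Rényi entropy of order 2. The function names, the absolute-difference matching rule, and the tolerance `eps` are hypothetical choices made for this example only.

```python
import random
from collections import Counter


def expected_join_size(n, m, probs):
    """Mean exact-join size for two independent tables with n and m records
    whose join keys are i.i.d. draws from the distribution `probs`."""
    return n * m * sum(p * p for p in probs)


def simulate_join_size(n, m, probs, rng):
    """Draw two random tables and count the matching key pairs (exact join)."""
    keys = range(len(probs))
    left = Counter(rng.choices(keys, weights=probs, k=n))
    right = Counter(rng.choices(keys, weights=probs, k=m))
    # Each key contributes (occurrences on the left) * (occurrences on the right).
    return sum(count * right[key] for key, count in left.items())


def approx_join_size(left, right, eps):
    """Count pairs of numeric keys that match approximately, here taken
    (as an assumption) to mean |x - y| <= eps. Brute force, O(n * m)."""
    return sum(1 for x in left for y in right if abs(x - y) <= eps)


if __name__ == "__main__":
    rng = random.Random(0)
    probs = [0.5, 0.3, 0.2]
    trials = 2000
    mean_sim = sum(simulate_join_size(30, 40, probs, rng) for _ in range(trials)) / trials
    print("theoretical mean:", expected_join_size(30, 40, probs))
    print("simulated mean:  ", mean_sim)
```

The simulated average should sit close to the theoretical mean. The brute-force approximate join is only meant to show how a tolerance-based match changes the pair count; it is not tuned for large tables.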

Place, publisher, year, edition, pages
Boston: Kluwer, 2010. Vol. 12, no. 1, p. 63-89.
Keyword [en]
Random database, Join, Tests, Approximate matching, Rényi entropy, Poisson approximation
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
URN: urn:nbn:se:umu:diva-30783
DOI: 10.1007/s11009-008-9092-4
ISI: 000273788900003
OAI: oai:DiVA.org:umu-30783
DiVA: diva2:286786
Available from: 2010-01-25. Created: 2010-01-15. Last updated: 2017-12-12. Bibliographically approved.

By author/editor
Seleznjev, Oleg