The space of models in machine learning: using Markov chains to model transitions
Umeå University, Faculty of Science and Technology, Department of Computing Science. Hamilton Institute, Maynooth University, Maynooth, Ireland; School of Informatics, University of Skövde, Skövde, Sweden. ORCID iD: 0000-0002-0368-8037
Umeå University, Faculty of Science and Technology, Department of Computing Science.
Dept. Information and Communications Engineering – CYBERCAT, Universitat Autònoma de Barcelona, Catalonia, Bellaterra, Spain.
2021 (English). In: Progress in Artificial Intelligence, ISSN 2192-6352, E-ISSN 2192-6360, Vol. 10, no 3, p. 321-332. Article in journal (Refereed). Published.
Abstract [en]

Machine and statistical learning are about constructing models from data. Data is usually understood as a set of records, that is, a database. However, databases are not static; they change over time. We can understand this as follows: there is a space of possible databases, and a database moves through this space during its lifetime. We may therefore consider transitions between databases, and the database space itself. NoSQL databases also fit this representation. In addition, when we learn models from databases, we can also consider the space of models. Naturally, there are relationships between the space of data and the space of models: any transition in the space of data may correspond to a transition in the space of models. We argue that a better understanding of the space of data and the space of models, as well as of the relationships between these two spaces, is fundamental to machine and statistical learning. The relationship between the two spaces can be exploited in several contexts, e.g., model selection and data privacy. We consider this relationship also fundamental to understanding generalization and overfitting. In this paper, we develop these ideas. We then consider a distance on the space of models based on a distance on the space of data. More specifically, we consider distance distribution functions and probabilistic metric spaces on the space of data and the space of models. Our modeling of changes in databases is based on Markov chains and transition matrices, and this modeling is used in the definition of the distances. We provide examples of our definitions.
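As a rough illustration of the idea of a database moving through a space of databases, the sketch below sets up a small Markov chain and computes a distribution function over the number of steps needed to reach one database from another. The three-state database space, the transition matrix `P`, and the helper names `n_step` and `hitting_cdf` are all assumptions made for this example. The abstract does not state the paper's definitions, so the first-passage distribution used here is only one plausible stand-in for a distance distribution function, not the authors' construction.

```python
import numpy as np

# Hypothetical database space with three states, e.g.
# DB0 = {}, DB1 = {r1}, DB2 = {r1, r2}. P[i, j] is the assumed
# probability of moving from database i to database j in one step.
P = np.array([
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
    [0.0, 0.4, 0.6],
])

def n_step(P, n):
    """Return the n-step transition matrix P^n."""
    return np.linalg.matrix_power(P, n)

def hitting_cdf(P, source, target, max_steps):
    """Return F where F[t] is the probability that a chain started at
    `source` has visited `target` within t steps (t = 0..max_steps)."""
    Q = P.copy()
    Q[target, :] = 0.0
    Q[target, target] = 1.0  # make the target absorbing
    dist = np.zeros(len(P))
    dist[source] = 1.0
    cdf = []
    for _ in range(max_steps + 1):
        cdf.append(dist[target])
        dist = dist @ Q  # advance the chain by one step
    return np.array(cdf)

F = hitting_cdf(P, source=0, target=2, max_steps=20)
```

Under these assumptions, `F` is non-decreasing in the number of steps and starts at 0 when source and target differ, which is the shape expected of a distance distribution function. In general such a function need not be symmetric in source and target.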

Place, publisher, year, edition, pages
Springer, 2021. Vol. 10, no 3, p. 321-332
Keywords [en]
Hypothesis space, Machine and statistical learning models, Probabilistic metric spaces, Space of data, Space of models
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:umu:diva-183009
DOI: 10.1007/s13748-021-00242-6
ISI: 000639627000001
Scopus ID: 2-s2.0-85104447939
OAI: oai:DiVA.org:umu-183009
DiVA, id: diva2:1555208
Funder
Swedish Research Council, 2016-03346
Swedish Research Council, 2017-2020
Knut and Alice Wallenberg Foundation
Available from: 2021-05-18 Created: 2021-05-18 Last updated: 2024-06-25 Bibliographically approved

Open Access in DiVA

fulltext (537 kB), 100 downloads
File information
File name: FULLTEXT02.pdf
File size: 537 kB
Checksum (SHA-512): 3c4566312597738df5e7d8100356f58f576b4821d506a048645cc21eb80a6192c5c918c88857ab24c6d8a08397d0ef4e1c21fce6cc2785722be6cc8dcb735ace
Type: fulltext
Mimetype: application/pdf

Authority records

Torra, Vicenç; Taha, Mariam

Total: 138 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 397 hits