Nonparametric bagging clustering methods to identify latent structures from a sequence of dependent categorical data
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. ORCID iD: 0000-0002-9040-6674
Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. ORCID iD: 0000-0003-1591-5716
Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0003-1098-0076
2023 (English). In: Computational Statistics & Data Analysis, ISSN 0167-9473, E-ISSN 1872-7352, Vol. 177, article id 107583. Article in journal (Refereed). Published
Abstract [en]

Nonparametric bagging clustering methods are studied and compared to identify latent structures from a sequence of dependent categorical data observed along a one-dimensional (discrete) time domain. The frequency of the observed categories is assumed to be generated by a (slowly varying) latent signal, according to latent state-specific probability distributions. The bagging clustering methods use random tessellations (partitions) of the time domain and clustering of the category frequencies of the observed data in the tessellation cells to recover the latent signal, within a bagging framework. New and existing ways of generating the tessellations and clustering are discussed and combined into different bagging clustering methods. Edge tessellations and adaptive tessellations are the newly proposed ways of forming partitions. Composite methods are also introduced that use (automated) decision rules based on entropy measures to choose among the proposed bagging clustering methods. The performance of all the methods is compared in a simulation study. The simulation study shows that local and global entropy measures are powerful tools for improving the recovery of the latent signal, both via the adaptive tessellation strategies (local entropy) and in designing composite methods (global entropy). The composite methods are robust and improve overall performance, in particular the composite method using adaptive (edge) tessellations.
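
To make the general idea in the abstract concrete, the following is a minimal sketch of bagging clustering over random partitions of a discrete time axis. It is not the authors' implementation. Every name and parameter here is hypothetical, and several simplifications are assumed: cells are drawn with uniformly random lengths (rather than the edge or adaptive tessellations of the paper), the cell frequencies are clustered with a plain k-means, cluster labels are aligned across bags by sorting on the frequency of category 0 (a shortcut that suits this toy example only), and the entropy of the per-time-point votes is used as a rough stand-in for a local entropy measure.

```python
import math
import random
from collections import Counter

def simulate(T, change, probs, rng):
    """Toy data (assumed setup): latent state 0 before `change`, state 1 after."""
    states = [0 if t < change else 1 for t in range(T)]
    cats = list(range(len(probs[0])))
    data = [rng.choices(cats, weights=probs[s])[0] for s in states]
    return states, data

def random_tessellation(T, min_len, max_len, rng):
    """Partition range(T) into consecutive cells of random length >= min_len."""
    cuts = [0]
    while T - cuts[-1] >= min_len + max_len:
        cuts.append(cuts[-1] + rng.randint(min_len, max_len))
    cuts.append(T)
    return list(zip(cuts[:-1], cuts[1:]))

def cell_frequencies(data, cells, n_cats):
    """Relative frequency of each category inside each cell [a, b)."""
    freqs = []
    for a, b in cells:
        counts = Counter(data[a:b])
        freqs.append([counts[c] / (b - a) for c in range(n_cats)])
    return freqs

def dist2(x, y):
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

def kmeans(points, k, rng, n_iter=20):
    """Plain k-means with farthest-first initialisation."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    # Toy label alignment (assumption): rank clusters by frequency of category 0.
    order = sorted(range(k), key=lambda j: -centers[j][0])
    rank = {j: r for r, j in enumerate(order)}
    return [rank[lab] for lab in labels]

def bagging_clustering(data, k, n_cats, n_bags, min_len, max_len, rng):
    """Majority vote over many random tessellations, plus vote entropy per time point."""
    T = len(data)
    votes = [[0] * k for _ in range(T)]
    for _ in range(n_bags):
        cells = random_tessellation(T, min_len, max_len, rng)
        labels = kmeans(cell_frequencies(data, cells, n_cats), k, rng)
        for (a, b), lab in zip(cells, labels):
            for t in range(a, b):
                votes[t][lab] += 1
    estimate = [max(range(k), key=lambda j: v[j]) for v in votes]
    entropy = [-sum((c / n_bags) * math.log(c / n_bags) for c in v if c) for v in votes]
    return estimate, entropy

rng = random.Random(0)
probs = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]  # assumed state-specific distributions
states, data = simulate(600, 300, probs, rng)
estimate, entropy = bagging_clustering(
    data, k=2, n_cats=3, n_bags=50, min_len=10, max_len=40, rng=rng
)
accuracy = sum(e == s for e, s in zip(estimate, states)) / len(states)
most_uncertain = max(range(len(entropy)), key=lambda t: entropy[t])
print(f"accuracy: {accuracy:.3f}, most uncertain time point: {most_uncertain}")
```

In this sketch the vote entropy is zero wherever all bags agree and grows where cells from different tessellations disagree, which in the toy data is expected near the change in the latent state. An adaptive scheme in the spirit of the abstract could use such a quantity to refine the partition locally, but how the paper does this is not specified in the record above.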

Place, publisher, year, edition, pages
Elsevier, 2023. Vol. 177, article id 107583
Keywords [en]
Bagging methods, Categorical dependent data, Clustering, Entropy
National Category
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-198931
DOI: 10.1016/j.csda.2022.107583
ISI: 000930488900007
Scopus ID: 2-s2.0-85135796679
OAI: oai:DiVA.org:umu-198931
DiVA, id: diva2:1696677
Funder
Swedish Research Council, 340-2013-5203
Available from: 2022-09-19. Created: 2022-09-19. Last updated: 2024-08-15. Bibliographically approved.

Open Access in DiVA

fulltext (2302 kB), 327 downloads
File information
File name: FULLTEXT01.pdf. File size: 2302 kB. Checksum: SHA-512
09350dcc74db5a48b5ec6c184e4e9681efa069fb2a58d4ab5e16989c9814d5c1dd2ab23246c75f1be3bb083aa542c7444f6d4048fbd11da8bb6e9095fe2c0883
Type: fulltext. Mimetype: application/pdf


Authors

Abramowicz, Konrad; Sjöstedt de Luna, Sara; Strandberg, Johan
