Källberg, David
Publications (7 of 7)
Källberg, D., Belyaev, Y. & Rydén, P. (2018). A moment-distance hybrid method for estimating a mixture of two symmetric densities. Modern Stochastics: Theory and Applications, 5(1), 1-36
2018 (English). In: Modern Stochastics: Theory and Applications, ISSN 2351-6054, Vol. 5, no 1, p. 1-36. Article in journal (Refereed). Published.
Abstract [en]

In clustering of high-dimensional data a variable selection is commonly applied to obtain an accurate grouping of the samples. For two-class problems this selection may be carried out by fitting a mixture distribution to each variable. We propose a hybrid method for estimating a parametric mixture of two symmetric densities. The estimator combines the method of moments with the minimum distance approach. An evaluation study including both extensive simulations and gene expression data from acute leukemia patients shows that the hybrid method outperforms a maximum-likelihood estimator in model-based clustering. The hybrid estimator is flexible and performs well also under imprecise model assumptions, suggesting that it is robust and suited for real problems.

Keywords
inference for mixtures, method of moments, minimum distance, model-based clustering
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-144644 (URN); 10.15559/17-VMSTA93 (DOI); 000434875200001 ()
Funder
Swedish Research Council, 340-2013-5185
Available from: 2018-02-08. Created: 2018-02-08. Last updated: 2018-09-19. Bibliographically approved.
Rydén, P., Källberg, D. & Belyaev, Y. K. (2017). The HRD-Algorithm: a general method for parametric estimation of two-component mixture models. Paper presented at Analytical and Computational Methods in Probability Theory. ACMPT 2017. Lecture Notes in Computer Science, 10684, 497-508
2017 (English). In: Lecture Notes in Computer Science, ISBN 978-3-319-71504-9, Vol. 10684, p. 497-508. Article in journal (Refereed). Published.
Abstract [en]

We introduce a novel approach to estimate the parameters of a mixture of two distributions. The method combines a grid approach with the method of moments and can be applied to a wide range of two-component mixture models. The grid approach enables the use of parallel computing and the method can easily be combined with resampling techniques. We derive the method for the special cases when the data are described by the mixture of two Weibull distributions or the mixture of two normal distributions, and apply the method on gene expression data from 409 ER+ breast cancer patients.

Place, publisher, year, edition, pages
Springer, 2017
Keywords
Mixture models, Parameter estimation, Method of moments, Grid-approach, Resampling, Cluster analysis, Variable selection, High-dimensional data
National Category
Natural Sciences; Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-144646 (URN); 10.1007/978-3-319-71504-9_41 (DOI)
Conference
Analytical and Computational Methods in Probability Theory. ACMPT 2017.
Funder
Swedish Research Council, 340-2013-5185
Available from: 2018-02-08. Created: 2018-02-08. Last updated: 2018-10-15. Bibliographically approved.
Källberg, D. & Seleznjev, O. (2016). Estimation of entropy-type integral functionals. Communications in Statistics - Theory and Methods, 45(4), 887-905
2016 (English). In: Communications in Statistics - Theory and Methods, ISSN 0361-0926, E-ISSN 1532-415X, Vol. 45, no 4, p. 887-905. Article in journal (Other academic). Published.
Abstract [en]

Entropy-type integral functionals of densities are widely used in mathematical statistics, information theory, and computer science. Examples include measures of closeness between distributions (e.g., density power divergence) and uncertainty characteristics for a random variable (e.g., Rényi entropy). In this paper, we study U-statistic estimators for a class of such functionals. The estimators are based on ε-close vector observations in the corresponding independent and identically distributed samples. We prove asymptotic properties of the estimators (consistency and asymptotic normality) under mild integrability and smoothness conditions for the densities. The results can be applied in diverse problems in mathematical statistics and computer science (e.g., distribution identification problems, approximate matching for random databases, two-sample problems).

Keywords
Divergence estimation, asymptotic normality, U-statistics, inter-point distances, quadratic functional, entropy estimation
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-60993 (URN); 10.1080/03610926.2013.853789 (DOI); 000370612900005 ()
Available from: 2012-11-06. Created: 2012-11-06. Last updated: 2018-06-08. Bibliographically approved.
Källberg, D., Leonenko, N. & Seleznjev, O. (2014). Statistical estimation of quadratic Rényi entropy for a stationary m-dependent sequence. Journal of nonparametric statistics (Print), 26(2), 385-411
2014 (English). In: Journal of nonparametric statistics (Print), ISSN 1048-5252, E-ISSN 1029-0311, Vol. 26, no 2, p. 385-411. Article in journal (Refereed). Published.
Abstract [en]

The Rényi entropy is a generalization of the Shannon entropy and is widely used in mathematical statistics and applied sciences for quantifying the uncertainty in a probability distribution. We consider estimation of the quadratic Rényi entropy and related functionals for the marginal distribution of a stationary m-dependent sequence. The U-statistic estimators under study are based on the number of ε-close vector observations in the corresponding sample. A variety of asymptotic properties for these estimators are obtained (e.g., consistency, asymptotic normality, Poisson convergence). The results can be used in diverse statistical and computer science problems whenever the conventional independence assumption is too strong (e.g., ε-keys in time series databases, distribution identification problems for dependent samples).

Place, publisher, year, edition, pages
Taylor & Francis, 2014
Keywords
entropy estimation, quadratic Rényi entropy, stationary m-dependent sequence, U-statistics, inter-point distances
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-79958 (URN); 10.1080/10485252.2013.854438 (DOI); 000334160600011 ()
Note

Included in thesis 2013 in submitted form.

Available from: 2013-09-04. Created: 2013-09-04. Last updated: 2018-06-08. Bibliographically approved.
Källberg, D. (2013). Nonparametric Statistical Inference for Entropy-type Functionals. (Doctoral dissertation). Umeå: Umeå universitet
2013 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Icke-parametrisk statistisk inferens för entropirelaterade funktionaler
Abstract [en]

In this thesis, we study statistical inference for entropy, divergence, and related functionals of one or two probability distributions. Asymptotic properties of particular nonparametric estimators of such functionals are investigated. We consider estimation from both independent and dependent observations. The thesis consists of an introductory survey of the subject and some related theory and four papers (A-D).

In Paper A, we consider a general class of entropy-type functionals which includes, for example, integer order Rényi entropy and certain Bregman divergences. We propose U-statistic estimators of these functionals based on the coincident or epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove some asymptotic properties of the estimators such as consistency and asymptotic normality. Applications of the obtained results related to entropy maximizing distributions, stochastic databases, and image matching are discussed.

In Paper B, we provide some important generalizations of the results for continuous distributions in Paper A. The consistency of the estimators is obtained under weaker density assumptions. Moreover, we introduce a class of functionals of quadratic order, including both entropy and divergence, and prove normal limit results for the corresponding estimators which are valid even for densities of low smoothness. The asymptotic properties of a divergence-based two-sample test are also derived.

In Paper C, we consider estimation of the quadratic Rényi entropy and some related functionals for the marginal distribution of a stationary m-dependent sequence. We investigate asymptotic properties of the U-statistic estimators for these functionals introduced in Papers A and B when they are based on a sample from such a sequence. We prove consistency, asymptotic normality, and Poisson convergence under mild assumptions for the stationary m-dependent sequence. Applications of the results to time-series databases and entropy-based testing for dependent samples are discussed.

In Paper D, we further develop the approach for estimation of quadratic functionals with m-dependent observations introduced in Paper C. We consider quadratic functionals for one or two distributions. The consistency and rate of convergence of the corresponding U-statistic estimators are obtained under weak conditions on the stationary m-dependent sequences. Additionally, we propose estimators based on incomplete U-statistics and show their consistency properties under more general assumptions.

Place, publisher, year, edition, pages
Umeå: Umeå universitet, 2013. p. 21
Keywords
entropy estimation, Rényi entropy, divergence estimation, quadratic density functional, U-statistics, consistency, asymptotic normality, Poisson convergence, stationary m-dependent sequence, inter-point distances, entropy maximizing distribution, two-sample problem, approximate matching
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-79976 (URN); 978-91-7459-701-1 (ISBN)
Public defence
2013-09-27, MIT-huset, MA121, Umeå universitet, Umeå, 10:00 (English)
Available from: 2013-09-06. Created: 2013-09-04. Last updated: 2018-06-08. Bibliographically approved.
Källberg, D., Leonenko, N. & Seleznjev, O. (2012). Statistical inference for Rényi entropy functionals. In: Antje Düsterhöft, Meike Klettke, Klaus-Dieter Schewe (Eds.), Conceptual modelling and its theoretical foundations (pp. 36-51). Springer Berlin/Heidelberg
2012 (English). In: Conceptual modelling and its theoretical foundations / [ed] Antje Düsterhöft, Meike Klettke, Klaus-Dieter Schewe, Springer Berlin/Heidelberg, 2012, p. 36-51. Chapter in book (Refereed)
Abstract [en]

Numerous entropy-type characteristics (functionals) generalizing Rényi entropy are widely used in mathematical statistics, physics, information theory, and signal processing for characterizing uncertainty in probability distributions and distribution identification problems. We consider estimators of some entropy (integral) functionals for discrete and continuous distributions based on the number of epsilon-close vector records in the corresponding independent and identically distributed samples from two distributions. The proposed estimators are generalized U-statistics. We show the asymptotic properties of these estimators (e.g., consistency and asymptotic normality). The results can be applied in various problems in computer science and mathematical statistics (e.g., approximate matching for random databases, record linkage, image matching).

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2012
Series
Lecture Notes in Computer Science, ISSN 0302-9743 ; 7260/2012
Keywords
entropy estimation, Rényi entropy, U-statistics, approximate matching, asymptotic normality
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-60788 (URN); 10.1007/978-3-642-28279-9_5 (DOI); 978-3-642-28278-2 (ISBN); 978-3-642-28279-9 (ISBN)
Available from: 2012-11-14. Created: 2012-10-29. Last updated: 2018-06-08. Bibliographically approved.
Källberg, D. & Seleznjev, O. Estimation of quadratic density functionals under m-dependence.
(English). Manuscript (preprint) (Other academic)
Abstract [en]

In this paper, we study estimation of certain integral functionals of one or two densities with samples from stationary m-dependent sequences. We consider two types of U-statistic estimators for these functionals that are functions of the number of epsilon-close vector observations in the samples. We show that the estimators are consistent and obtain their rates of convergence under weak distributional assumptions. In particular, we propose estimators based on incomplete U-statistics which have favorable consistency properties even when m-dependence is the only dependence condition that can be imposed on the stationary sequences. The results can be used for divergence and entropy estimation, and thus find many applications in statistics and applied sciences.

Keywords
quadratic density functional, entropy estimation, divergence estimation, stationary m-dependent sequences, Rényi entropy, incomplete U-statistics
National Category
Probability Theory and Statistics
Research subject
Mathematical Statistics
Identifiers
urn:nbn:se:umu:diva-79956 (URN)
Available from: 2013-09-04. Created: 2013-09-04. Last updated: 2018-06-08. Bibliographically approved.