umu.se Publications

Link to record
http://umu.diva-portal.org/smash/person.jsf?pid=authority-person:66353

Källberg, David

Cluster analysis on high dimensional RNA-seq data with applications to cancer research: An evaluation study

### Vidman, Linda

### Källberg, David

### Rydén, Patrik

2019 (English). In: PLoS ONE, E-ISSN 1932-6203, Vol. 14, no 12, article id e0219102. Article in journal (Refereed). Published
##### Abstract [en]

##### Place, publisher, year, edition, pages

San Francisco: Public Library of Science, 2019
##### Keywords

Cancer, cluster analysis
##### National Category

Probability Theory and Statistics; Bioinformatics and Systems Biology
##### Identifiers

urn:nbn:se:umu:diva-167274 (URN); 10.1371/journal.pone.0219102 (DOI); 31805048 (PubMedID)

Available from: 2020-01-14 Created: 2020-01-14 Last updated: 2020-01-15 Bibliographically approved

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics.

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

Background: Clustering of gene expression data is widely used to identify novel subtypes of cancer. Plenty of clustering approaches have been proposed, but there is a lack of knowledge regarding their relative merits and how data characteristics influence the performance. We evaluate how cluster analysis choices affect the performance by studying four publicly available human cancer data sets: breast, brain, kidney and stomach cancer. In particular, we focus on how the sample size, distribution of subtypes and sample heterogeneity affect the performance.

Results: In general, increasing the sample size had limited effect on the clustering performance, e.g. for the breast cancer data similar performance was obtained for n = 40 as for n = 330. The relative distribution of the subtypes had a noticeable effect on the ability to identify the disease subtypes and data with disproportionate cluster sizes turned out to be difficult to cluster. Both the choice of clustering method and selection method affected the ability to identify the subtypes, but the relative performance varied between data sets, making it difficult to rank the approaches. For some data sets, the performance was substantially higher when the clustering was based on data from only one sex compared to data from a mixed population. This suggests that homogeneous data are easier to cluster than heterogeneous data and that clustering males and females individually may be beneficial and increase the chance to detect novel subtypes. It was also observed that the performance often differed substantially between females and males.

Conclusions: The number of samples seems to have a limited effect on the performance, while the heterogeneity, at least with respect to sex, is important for the performance. Hence, by analyzing the genders separately, the possible loss caused by having fewer samples could be outweighed by the benefit of more homogeneous data.
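The abstract above does not say how agreement between a clustering and the known subtypes was scored. As a minimal sketch, assuming the adjusted Rand index (ARI) is the performance measure, the pure-Python function below compares a predicted partition with reference subtype labels. The function name `adjusted_rand_index` and the toy labels are illustrative and not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples."""
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: the partitions agree trivially

    # Pairs of samples that share a cluster in both, in the first, in the second labeling.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = in_true * in_pred / total_pairs  # agreement expected by chance
    maximum = (in_true + in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (both - expected) / (maximum - expected)


# Toy example (hypothetical labels): ARI ignores how the clusters are named.
true_subtypes = ["A", "A", "A", "B", "B", "B"]
relabelled = [1, 1, 1, 0, 0, 0]   # same partition, different cluster names
mixed = [0, 0, 1, 1, 0, 1]        # partition that cuts across the subtypes
ari_relabelled = adjusted_rand_index(true_subtypes, relabelled)
ari_mixed = adjusted_rand_index(true_subtypes, mixed)
```

Because the index is corrected for chance, a partition unrelated to the subtypes scores near zero or below, which makes scores comparable across data sets with different cluster-size distributions.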

A moment-distance hybrid method for estimating a mixture of two symmetric densities

### Källberg, David

### Belyaev, Yuri

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Rydén, Patrik

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2018 (English). In: Modern Stochastics: Theory and Applications, ISSN 2351-6054, Vol. 5, no 1, p. 1-36. Article in journal (Refereed). Published
##### Abstract [en]

##### Keywords

inference for mixtures, method of moments, minimum distance, model-based clustering
##### National Category

Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-144644 (URN); 10.15559/17-VMSTA93 (DOI); 000434875200001 ()
##### Funder

Swedish Research Council, 340-2013-5185

Available from: 2018-02-08 Created: 2018-02-08 Last updated: 2018-09-19 Bibliographically approved

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

In clustering of high-dimensional data a variable selection is commonly applied to obtain an accurate grouping of the samples. For two-class problems this selection may be carried out by fitting a mixture distribution to each variable. We propose a hybrid method for estimating a parametric mixture of two symmetric densities. The estimator combines the method of moments with the minimum distance approach. An evaluation study including both extensive simulations and gene expression data from acute leukemia patients shows that the hybrid method outperforms a maximum-likelihood estimator in model-based clustering. The hybrid estimator is flexible and performs well also under imprecise model assumptions, suggesting that it is robust and suited for real problems.

The HRD-Algorithm: a general method for parametric estimation of two-component mixture models

### Rydén, Patrik

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Belyaev, Yu. K.

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2017 (English). In: Lecture Notes in Computer Science, ISBN 978-3-319-71504-9, Vol. 10684, p. 497-508. Article in journal (Refereed). Published
##### Abstract [en]

##### Place, publisher, year, edition, pages

Springer, 2017
##### Keywords

Mixture models, Parameter estimation, Method of moments, Grid-approach, Resampling, Cluster analysis, Variable selection, High-dimensional data
##### National Category

Natural Sciences; Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-144646 (URN); 10.1007/978-3-319-71504-9_41 (DOI)
##### Conference

Analytical and Computational Methods in Probability Theory. ACMPT 2017.
##### Funder

Swedish Research Council, 340-2013-5185

Available from: 2018-02-08 Created: 2018-02-08 Last updated: 2018-10-15 Bibliographically approved

We introduce a novel approach to estimate the parameters of a mixture of two distributions. The method combines a grid approach with the method of moments and can be applied to a wide range of two-component mixture models. The grid approach enables the use of parallel computing and the method can easily be combined with resampling techniques. We derive the method for the special cases when the data are described by the mixture of two Weibull distributions or the mixture of two normal distributions, and apply the method on gene expression data from 409 ER+ breast cancer patients.
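The general idea of pairing a parameter grid with moment equations can be illustrated with a toy example. The sketch below is not the HRD-algorithm itself: it assumes two normal components with a common variance, puts a grid on the mixing proportion `p` and the mean separation `delta`, solves the mean and variance equations for the remaining parameters, and picks the grid point whose fitted CDF is closest to the empirical CDF. The equal-variance model, the selection criterion, the grids and all names are assumptions made for illustration.

```python
import math
import random


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def fit_two_normal_mixture(data, p_grid, delta_grid):
    """Grid over (p, delta); moment equations supply mu1, mu2 and sigma."""
    n = len(data)
    xs = sorted(data)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    best = None
    # Every grid point is processed independently, so this double loop
    # could be distributed across workers.
    for p in p_grid:
        for delta in delta_grid:
            # Model: p*N(mu1, s^2) + (1-p)*N(mu2, s^2) with delta = mu2 - mu1.
            # Moment equations: mean = mu1 + (1-p)*delta,
            #                   var  = s^2 + p*(1-p)*delta^2.
            sigma2 = var - p * (1.0 - p) * delta ** 2
            if sigma2 <= 1e-8:
                continue  # grid point incompatible with the sample variance
            mu1 = mean - (1.0 - p) * delta
            mu2 = mean + p * delta
            sigma = math.sqrt(sigma2)
            # Mean squared gap between the fitted and the empirical CDF.
            score = 0.0
            for i, x in enumerate(xs):
                fitted = (p * normal_cdf(x, mu1, sigma)
                          + (1.0 - p) * normal_cdf(x, mu2, sigma))
                score += (fitted - (i + 0.5) / n) ** 2
            score /= n
            if best is None or score < best[0]:
                best = (score, p, mu1, mu2, sigma)
    return best


# Simulated data (hypothetical): 30% from N(0, 1) and 70% from N(4, 1).
rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) if rng.random() < 0.3 else rng.gauss(4.0, 1.0)
        for _ in range(1000)]
p_grid = [i / 20 for i in range(1, 20)]         # 0.05, 0.10, ..., 0.95
delta_grid = [j / 4 for j in range(0, 33)]      # 0.00, 0.25, ..., 8.00
score, p_hat, mu1_hat, mu2_hat, sigma_hat = fit_two_normal_mixture(
    data, p_grid, delta_grid)
```

Restricting `delta` to non-negative values fixes the ordering of the two components, which avoids the label-switching ambiguity of mixture models. Repeating the fit on bootstrap resamples of `data` would be one way to combine such a grid search with resampling.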

Estimation of entropy-type integral functionals

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Seleznjev, Oleg

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2016 (English). In: Communications in Statistics - Theory and Methods, ISSN 0361-0926, E-ISSN 1532-415X, Vol. 45, no 4, p. 887-905. Article in journal (Other academic). Published
##### Abstract [en]

##### Keywords

Divergence estimation, asymptotic normality, U-statistics, inter-point distances, quadratic functional, entropy estimation
##### National Category

Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-60993 (URN); 10.1080/03610926.2013.853789 (DOI); 000370612900005 ()

Available from: 2012-11-06 Created: 2012-11-06 Last updated: 2018-06-08 Bibliographically approved

Entropy-type integral functionals of densities are widely used in mathematical statistics, information theory, and computer science. Examples include measures of closeness between distributions (e.g., density power divergence) and uncertainty characteristics for a random variable (e.g., Rényi entropy). In this paper, we study *U*-statistic estimators for a class of such functionals. The estimators are based on ε-close vector observations in the corresponding independent and identically distributed samples. We prove asymptotic properties of the estimators (consistency and asymptotic normality) under mild integrability and smoothness conditions for the densities. The results can be applied in diverse problems in mathematical statistics and computer science (e.g., distribution identification problems, approximate matching for random databases, two-sample problems).
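For the simplest member of this class, the quadratic functional (the integral of the squared density), counting ε-close observations can be sketched as follows. The fraction of sample pairs within distance ε, divided by the volume of an ε-ball, approximates the integral of f² when ε is small. The sketch assumes Euclidean distance and a fixed ε, whereas the asymptotic theory lets ε shrink with the sample size. The function names and the simulated data are hypothetical.

```python
import math
import random
from itertools import combinations


def unit_ball_volume(d):
    """Volume of the unit Euclidean ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def quadratic_functional_estimate(sample, eps):
    """Estimate the integral of f^2 from the share of eps-close pairs."""
    n = len(sample)
    d = len(sample[0])
    close_pairs = sum(1 for x, y in combinations(sample, 2)
                      if math.dist(x, y) <= eps)
    pair_fraction = close_pairs / math.comb(n, 2)   # a U-statistic
    return pair_fraction / (unit_ball_volume(d) * eps ** d)


def quadratic_renyi_entropy_estimate(sample, eps):
    """Plug-in estimate of the quadratic Renyi entropy, -log of the integral of f^2."""
    q_hat = quadratic_functional_estimate(sample, eps)
    if q_hat <= 0.0:
        raise ValueError("no eps-close pairs; increase eps or the sample size")
    return -math.log(q_hat)


# Simulated check: for Uniform(0, 1) the integral of f^2 is 1 and the
# quadratic Renyi entropy is 0.
rng = random.Random(0)
sample = [(rng.random(),) for _ in range(500)]
q_hat = quadratic_functional_estimate(sample, eps=0.05)
h2_hat = quadratic_renyi_entropy_estimate(sample, eps=0.05)
```

With a fixed ε the estimate is slightly biased (here by the edges of the unit interval), which is one reason the choice of ε relative to the sample size matters.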

Statistical estimation of quadratic Rényi entropy for a stationary *m*-dependent sequence

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Leonenko, Nikolaj

### Seleznjev, Oleg

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2014 (English). In: Journal of nonparametric statistics (Print), ISSN 1048-5252, E-ISSN 1029-0311, Vol. 26, no 2, p. 385-411. Article in journal (Refereed). Published
##### Abstract [en]

##### Place, publisher, year, edition, pages

Taylor & Francis, 2014
##### Keywords

entropy estimation, quadratic Rényi entropy, stationary m-dependent sequence, U-statistics, inter-point distances
##### National Category

Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-79958 (URN); 10.1080/10485252.2013.854438 (DOI); 000334160600011 ()

##### Note

Cardiff University, School of Mathematics.

The Rényi entropy is a generalization of the Shannon entropy and is widely used in mathematical statistics and applied sciences for quantifying the uncertainty in a probability distribution. We consider estimation of the quadratic Rényi entropy and related functionals for the marginal distribution of a stationary *m*-dependent sequence. The *U*-statistic estimators under study are based on the number of ε-close vector observations in the corresponding sample. A variety of asymptotic properties for these estimators are obtained (e.g., consistency, asymptotic normality, Poisson convergence). The results can be used in diverse statistical and computer science problems whenever the conventional independence assumption is too strong (e.g., ε-keys in time series databases, distribution identification problems for dependent samples).
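For reference, the standard definitions behind this abstract can be written as follows. The notation is ours and is not taken from the paper: $f$ denotes the marginal density on $\mathbb{R}^d$ and $s$ the order of the entropy.

```latex
\begin{align*}
h_s(f) &= \frac{1}{1-s}\,\log \int_{\mathbb{R}^d} f(x)^s \, dx,
  \qquad s > 0,\ s \neq 1, \\
\lim_{s \to 1} h_s(f) &= -\int_{\mathbb{R}^d} f(x) \log f(x) \, dx
  \qquad \text{(Shannon entropy)}, \\
h_2(f) &= -\log \int_{\mathbb{R}^d} f(x)^2 \, dx
  \qquad \text{(quadratic R\'enyi entropy)}.
\end{align*}
```

The quadratic case depends on the density only through the integral of $f^2$, which is why estimating that single functional suffices for estimating $h_2$.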

Included in thesis 2013 in submitted form.

Available from: 2013-09-04 Created: 2013-09-04 Last updated: 2018-06-08 Bibliographically approved

Nonparametric Statistical Inference for Entropy-type Functionals

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2013 (English). Doctoral thesis, comprehensive summary (Other academic)
##### Alternative title[sv]

Icke-parametrisk statistisk inferens för entropirelaterade funktionaler
##### Abstract [en]

##### Place, publisher, year, edition, pages

Umeå: Umeå universitet, 2013. p. 21
##### Keywords

entropy estimation, Rényi entropy, divergence estimation, quadratic density functional, U-statistics, consistency, asymptotic normality, Poisson convergence, stationary m-dependent sequence, inter-point distances, entropy maximizing distribution, two-sample problem, approximate matching
##### National Category

Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-79976 (URN); 978-91-7459-701-1 (ISBN)
##### Public defence

2013-09-27, MIT-huset, MA121, Umeå universitet, Umeå, 10:00 (English)
##### Opponent

### Koski, Timo

##### Supervisors

### Seleznjev, Oleg

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

Available from: 2013-09-06 Created: 2013-09-04 Last updated: 2018-06-08 Bibliographically approved

In this thesis, we study statistical inference for entropy, divergence, and related functionals of one or two probability distributions. Asymptotic properties of particular nonparametric estimators of such functionals are investigated. We consider estimation from both independent and dependent observations. The thesis consists of an introductory survey of the subject and some related theory and four papers (A-D).

In Paper A, we consider a general class of entropy-type functionals which includes, for example, integer order Rényi entropy and certain Bregman divergences. We propose *U*-statistic estimators of these functionals based on the coincident or epsilon-close vector observations in the corresponding independent and identically distributed samples. We prove some asymptotic properties of the estimators such as consistency and asymptotic normality. Applications of the obtained results related to entropy maximizing distributions, stochastic databases, and image matching are discussed.

In Paper B, we provide some important generalizations of the results for continuous distributions in Paper A. The consistency of the estimators is obtained under weaker density assumptions. Moreover, we introduce a class of functionals of quadratic order, including both entropy and divergence, and prove normal limit results for the corresponding estimators which are valid even for densities of low smoothness. The asymptotic properties of a divergence-based two-sample test are also derived.

In Paper C, we consider estimation of the quadratic Rényi entropy and some related functionals for the marginal distribution of a stationary *m*-dependent sequence. We investigate asymptotic properties of the *U*-statistic estimators for these functionals introduced in Papers A and B when they are based on a sample from such a sequence. We prove consistency, asymptotic normality, and Poisson convergence under mild assumptions for the stationary *m*-dependent sequence. Applications of the results to time-series databases and entropy-based testing for dependent samples are discussed.

In Paper D, we further develop the approach for estimation of quadratic functionals with *m*-dependent observations introduced in Paper C. We consider quadratic functionals for one or two distributions. The consistency and rate of convergence of the corresponding *U*-statistic estimators are obtained under weak conditions on the stationary *m*-dependent sequences. Additionally, we propose estimators based on incomplete *U*-statistics and show their consistency properties under more general assumptions.

Department of Mathematics, KTH Royal Institute of Technology (Kungliga Tekniska högskolan), Stockholm.

Statistical inference for Rényi entropy functionals

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Leonenko, Nikolaj

### Seleznjev, Oleg

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

2012 (English). In: Conceptual modelling and its theoretical foundations / [ed] Antje Düsterhöft, Meike Klettke, Klaus-Dieter Schewe, Springer Berlin/Heidelberg, 2012, p. 36-51. Chapter in book (Refereed)
##### Abstract [en]

##### Place, publisher, year, edition, pages

Springer Berlin/Heidelberg, 2012
##### Series

Lecture Notes in Computer Science, ISSN 0302-9743 ; 7260/2012
##### Keywords

entropy estimation, Rényi entropy, U-statistics, approximate matching, asymptotic normality
##### National Category

Probability Theory and Statistics
##### Identifiers

urn:nbn:se:umu:diva-60788 (URN); 10.1007/978-3-642-28279-9_5 (DOI); 978-3-642-28278-2 (ISBN); 978-3-642-28279-9 (ISBN)

Available from: 2012-11-14 Created: 2012-10-29 Last updated: 2018-06-08 Bibliographically approved

School of Mathematics, Cardiff University, Cardiff, UK.

Numerous entropy-type characteristics (functionals) generalizing Rényi entropy are widely used in mathematical statistics, physics, information theory, and signal processing for characterizing uncertainty in probability distributions and distribution identification problems. We consider estimators of some entropy (integral) functionals for discrete and continuous distributions based on the number of epsilon-close vector records in the corresponding independent and identically distributed samples from two distributions. The proposed estimators are generalized *U*-statistics. We show the asymptotic properties of these estimators (e.g., consistency and asymptotic normality). The results can be applied in various problems in computer science and mathematical statistics (e.g., approximate matching for random databases, record linkage, image matching).

Combining epigenetic and clinicopathological variables improves prognostic prediction in clear cell Renal Cell Carcinoma

### Andersson-Evelönn, Emma

### Vidman, Linda

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Källberg, David

### Landfors, Mattias

### Liu, Xijia

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Ljungberg, Börje

### Hultdin, Magnus

### Degerman, Sofie

### Rydén, Patrik

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

(English). Manuscript (preprint) (Other academic)
##### Keywords

DNA methylation, cancer, cluster analysis, classification, clear cell renal cell carcinoma
##### National Category

Cancer and Oncology; Probability Theory and Statistics
##### Identifiers

urn:nbn:se:umu:diva-167269 (URN)

Available from: 2020-01-14 Created: 2020-01-14 Last updated: 2020-01-16 Bibliographically approved

Umeå University, Faculty of Medicine, Department of Medical Biosciences, Pathology.

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics. Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics.

Umeå University, Faculty of Medicine, Department of Medical Biosciences, Pathology.

Umeå University, Faculty of Medicine, Department of Surgical and Perioperative Sciences, Urology and Andrology.

Umeå University, Faculty of Medicine, Department of Medical Biosciences, Pathology.

Umeå University, Faculty of Medicine, Department of Medical Biosciences, Pathology. Umeå University, Faculty of Medicine, Department of Clinical Microbiology.

Comparison of methods for variable selection in clustering of high-dimensional RNA-sequencing data to identify cancer subtypes

### Källberg, David

### Vidman, Linda

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Rydén, Patrik

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

(English). Manuscript (preprint) (Other academic)
##### Keywords

feature selection, clustering, RNA-seq, cancer
##### National Category

Probability Theory and Statistics
##### Identifiers

urn:nbn:se:umu:diva-167264 (URN)

Available from: 2020-01-14 Created: 2020-01-14 Last updated: 2020-01-16 Bibliographically approved

Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics.

Estimation of quadratic density functionals under *m*-dependence

### Källberg, David

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

### Seleznjev, Oleg

Umeå University, Faculty of Science and Technology, Department of Mathematics and Mathematical Statistics.

(English). Manuscript (preprint) (Other academic)
##### Abstract [en]

##### Keywords

quadratic density functional, entropy estimation, divergence estimation, stationary m-dependent sequences, Rényi entropy, incomplete U-statistics
##### National Category

Probability Theory and Statistics
##### Research subject

Mathematical Statistics
##### Identifiers

urn:nbn:se:umu:diva-79956 (URN)

Available from: 2013-09-04 Created: 2013-09-04 Last updated: 2018-06-08 Bibliographically approved

In this paper, we study estimation of certain integral functionals of one or two densities with samples from stationary *m*-dependent sequences. We consider two types of *U*-statistic estimators for these functionals that are functions of the number of epsilon-close vector observations in the samples. We show that the estimators are consistent and obtain their rates of convergence under weak distributional assumptions. In particular, we propose estimators based on incomplete *U*-statistics which have favorable consistency properties even when *m*-dependence is the only dependence condition that can be imposed on the stationary sequences. The results can be used for divergence and entropy estimation, and thus find many applications in statistics and applied sciences.
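The abstract does not specify which pairs the incomplete *U*-statistics retain. One natural construction, assumed here purely for illustration, keeps only pairs of observations whose time indices differ by more than *m*, since such pairs are independent under *m*-dependence. The sketch below applies this to a scalar series and estimates the integral of the squared marginal density from the share of ε-close pairs. The function name and the simulated 1-dependent series are hypothetical.

```python
import random


def quadratic_functional_gap_pairs(series, eps, m):
    """Estimate the integral of f^2 for the marginal density of a scalar
    series, using only pairs whose time indices differ by more than m."""
    n = len(series)
    close = 0
    used = 0
    for i in range(n):
        for j in range(i + m + 1, n):   # skip pairs that may be dependent
            used += 1
            if abs(series[i] - series[j]) <= eps:
                close += 1
    if used == 0:
        raise ValueError("series too short for the chosen m")
    return close / used / (2.0 * eps)   # 2*eps is the length of an eps-ball in 1D


# Simulated 1-dependent series: sums of neighbouring Uniform(0, 1) draws.
# The marginal density is triangular on [0, 2], for which the integral
# of f^2 equals 2/3.
rng = random.Random(0)
noise = [rng.random() for _ in range(601)]
series = [noise[t] + noise[t + 1] for t in range(600)]
q_hat = quadratic_functional_gap_pairs(series, eps=0.05, m=1)
```

Dropping the pairs with small index gaps discards only on the order of n·m of the roughly n²/2 pairs, so little information is lost when *m* is small relative to the series length.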