Estimating prediction error: cross-validation vs. accumulated prediction error
Häggström, Jenny. Umeå University, Faculty of Social Sciences, Department of Statistics.
de Luna, Xavier. Umeå University, Faculty of Social Sciences, Department of Statistics.
2010 (English). In: Communications in statistics. Simulation and computation, ISSN 0361-0918, E-ISSN 1532-4141, Vol. 39, no. 5, p. 880-898. Article in journal (Refereed). Published.
Abstract [en]

We study the validation of prediction rules such as regression models and classification algorithms through two out-of-sample strategies, cross-validation and accumulated prediction error. We use the framework of Efron (1983) where measures of prediction errors are defined as sample averages of expected errors and show through exact finite sample calculations that cross-validation and accumulated prediction error yield different smoothing parameter choices in nonparametric regression. The difference in choice does not vanish as sample size increases.
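
The two out-of-sample strategies named in the abstract can be illustrated with a minimal sketch. It is not the paper's exact setup: the Gaussian-kernel local constant (Nadaraya-Watson) smoother, the squared-error loss, the `n_start` burn-in, and all function names are assumptions chosen for illustration. Leave-one-out cross-validation predicts each observation from all the others, while accumulated prediction error predicts each observation only from those that precede it in the given order.

```python
import math
import random


def nw_predict(x_train, y_train, x0, h):
    """Local constant (Nadaraya-Watson) prediction at x0, Gaussian kernel, bandwidth h."""
    weights = [math.exp(-0.5 * ((xi - x0) / h) ** 2) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: fall back to the training mean (illustrative choice).
        return sum(y_train) / len(y_train)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


def loo_cv(x, y, h):
    """Leave-one-out cross-validation: mean squared error of predicting y[i] from all other points."""
    n = len(x)
    errors = []
    for i in range(n):
        pred = nw_predict(x[:i] + x[i + 1:], y[:i] + y[i + 1:], x[i], h)
        errors.append((y[i] - pred) ** 2)
    return sum(errors) / n


def ape(x, y, h, n_start=10):
    """Accumulated prediction error: predict y[t] from the first t points only, for t >= n_start.

    n_start is a hypothetical burn-in so the first fits have some data to work with.
    """
    n = len(x)
    if not 1 <= n_start < n:
        raise ValueError("n_start must satisfy 1 <= n_start < len(x)")
    errors = []
    for t in range(n_start, n):
        pred = nw_predict(x[:t], y[:t], x[t], h)
        errors.append((y[t] - pred) ** 2)
    return sum(errors) / len(errors)


def select_bandwidth(criterion, x, y, grid):
    """Return the bandwidth in grid that minimizes the given criterion."""
    scores = [criterion(x, y, h) for h in grid]
    return grid[scores.index(min(scores))]


if __name__ == "__main__":
    # Simulated data (assumed for the demo, not taken from the paper).
    random.seed(0)
    n = 80
    x = [random.random() for _ in range(n)]
    y = [math.sin(2 * math.pi * xi) + random.gauss(0, 0.3) for xi in x]
    grid = [0.02 + 0.02 * k for k in range(15)]
    print("LOO-CV bandwidth:", select_bandwidth(loo_cv, x, y, grid))
    print("APE bandwidth:   ", select_bandwidth(ape, x, y, grid))
```

Because accumulated prediction error fits on smaller, growing subsamples, the bandwidth it selects need not coincide with the cross-validation choice, which is the kind of discrepancy the abstract describes. Note that the result of `ape` depends on the order of the observations, whereas `loo_cv` does not.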

Place, publisher, year, edition, pages
Informa plc, 2010. Vol. 39, no. 5, p. 880-898.
Keyword [en]
Local polynomial regression, Nonparametric regression, Out-of-sample validation, Smoothing parameter
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-34149
DOI: 10.1080/03610911003650409
ISI: 000277568500002
OAI: oai:DiVA.org:umu-34149
DiVA: diva2:319164
Available from: 2010-05-14. Created: 2010-05-14. Last updated: 2017-12-12. Bibliographically approved.
In thesis
1. Selection of smoothing parameters with application in causal inference
2011 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

This thesis is a contribution to the research area concerned with selection of smoothing parameters in the framework of nonparametric and semiparametric regression. Selection of smoothing parameters is one of the most important issues in this framework and the choice can heavily influence subsequent results. A nonparametric or semiparametric approach is often desirable when large datasets are available, since it allows us to make fewer and weaker assumptions than a parametric approach requires. In the first paper we consider smoothing parameter selection in nonparametric regression when the purpose is to accurately predict future or unobserved data. We study the use of accumulated prediction errors and make comparisons to leave-one-out cross-validation, which is widely used by practitioners. In the second paper a general semiparametric additive model is considered and the focus is on selection of smoothing parameters when optimal estimation of some specific parameter is of interest. We introduce a double smoothing estimator of a mean squared error and propose to select smoothing parameters by minimizing this estimator. Our approach is compared with existing methods.

The third paper is concerned with the selection of smoothing parameters optimal for estimating average treatment effects defined within the potential outcome framework. For this estimation problem we propose double smoothing methods similar to the method proposed in the second paper. Theoretical properties of the proposed methods are derived and comparisons with existing methods are made by simulations.

In the last paper we apply our results from the third paper by using a double smoothing method for selecting smoothing parameters when estimating average treatment effects on the treated. We estimate the effect on BMI of divorcing in middle age. Rich data on socioeconomic conditions, health and lifestyle from Swedish longitudinal registers are used.

Place, publisher, year, edition, pages
Umeå: Statistiska institutionen, Umeå universitet, 2011. 27 p.
Series
Statistical studies, ISSN 1100-8989 ; 44
Keyword
Smoothing parameter selection, Nonparametric regression, Semiparametric additive model, Double smoothing, Causal inference, BMI, Divorce
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-39614 (URN)
978-91-7459-147-7 (ISBN)
Public defence
2011-03-04, Norra Beteendevetarhuset Hörsal 1031, Umeå universitet, Umeå, 10:15 (English)
Opponent
Supervisors
Available from: 2011-02-11. Created: 2011-02-02. Last updated: 2011-02-11. Bibliographically approved.

Authors
Häggström, Jenny; de Luna, Xavier
