The costs and benefits of uniformly valid causal inference with high-dimensional nuisance parameters
Umeå University, Faculty of Social Sciences, Umeå School of Business, Economics and Statistics, Statistics. (stat4reg) ORCID iD: 0000-0001-5442-9708
Umeå University, Faculty of Social Sciences, Umeå School of Business, Economics and Statistics, Statistics. (stat4reg) ORCID iD: 0000-0002-9086-7403
Umeå University, Faculty of Social Sciences, Umeå School of Business, Economics and Statistics, Statistics. (stat4reg) ORCID iD: 0000-0003-3187-1987
2023 (English). In: Statistical Science, ISSN 0883-4237, E-ISSN 2168-8745, Vol. 38, no. 1, pp. 1-12. Article in journal (Refereed). Published
Abstract [en]

Important advances have recently been achieved in developing procedures yielding uniformly valid inference for a low-dimensional causal parameter when high-dimensional nuisance models must be estimated. In this paper, we review the literature on uniformly valid causal inference and discuss the costs and benefits of using uniformly valid inference procedures. Naive estimation strategies based on regularisation, machine learning, or a preliminary model selection stage for the nuisance models have finite sample distributions which are badly approximated by their asymptotic distributions. To solve this serious problem, estimators which converge uniformly in distribution over a class of data generating mechanisms have been proposed in the literature. In order to obtain uniformly valid results in high-dimensional situations, sparsity conditions for the nuisance models typically need to be made, although a double robustness property holds, whereby if one of the nuisance models is more sparse, the other nuisance model is allowed to be less sparse. While uniformly valid inference is a highly desirable property, uniformly valid procedures pay a high price in terms of inflated variability. Our discussion of this dilemma is illustrated by the study of a double-selection outcome regression estimator, which we show is uniformly asymptotically unbiased, but is less variable than uniformly valid estimators in the numerical experiments conducted.
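
As a rough illustration of the double-selection idea mentioned in the abstract, the sketch below selects covariates with one lasso for the outcome and one lasso for the treatment, takes the union of the two selected sets, and refits by ordinary least squares. It is not the estimator studied in the paper: the linear model, the continuous treatment, the simulated data, the penalty levels, and the names `lasso_support`, `double_selection`, and `simulate` are all assumptions made for illustration.

```python
import numpy as np


def lasso_support(X, y, lam, n_iter=200):
    """Indices of non-zero lasso coefficients (plain coordinate descent).

    Columns are standardized, so each coordinate update is a soft-threshold
    of the covariance between the column and the partial residual.
    """
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    r = y - y.mean()                      # residual for beta = 0
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r += Xs[:, j] * beta[j]       # partial residual without column j
            rho = Xs[:, j] @ r / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)
            r -= Xs[:, j] * beta[j]
    return {int(j) for j in np.flatnonzero(beta)}


def double_selection(y, t, X, lam_y, lam_t):
    """Select covariates for outcome and for treatment, then refit by OLS."""
    s_y = lasso_support(X, y, lam_y)      # covariates predictive of the outcome
    s_t = lasso_support(X, t, lam_t)      # covariates predictive of the treatment
    cols = sorted(s_y | s_t)              # adjust for the union of both sets
    Z = np.column_stack([np.ones(len(y)), t, X[:, cols]])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1], s_y, s_t              # coef[1] is the treatment coefficient


def simulate(seed=0, n=500, p=50):
    """Hypothetical data: true treatment coefficient 1.0, X[:, 0] is a confounder."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    t = X[:, 0] + rng.normal(size=n)
    y = 1.0 * t + 0.2 * X[:, 0] + X[:, 1] + rng.normal(size=n)
    return y, t, X


y, t, X = simulate()
est, s_y, s_t = double_selection(y, t, X, lam_y=0.4, lam_t=0.3)
print(f"estimated treatment coefficient: {est:.3f}")
print("selected for outcome:", sorted(s_y))
print("selected for treatment:", sorted(s_t))
```

Taking the union means a covariate is adjusted for whenever either nuisance model picks it up, which is one way to see why a less sparse model on one side can be tolerated when the other is sparser. The penalty levels here are fixed by hand; in practice they would be chosen by theory or cross-validation.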

Place, publisher, year, edition, pages
Institute of Mathematical Statistics, 2023. Vol. 38, no. 1, pp. 1-12
Keywords [en]
Double robustness, Machine learning, Post-model selection inference, Regularization, Superefficiency
National category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:umu:diva-199231; DOI: 10.1214/21-STS843; ISI: 000991879600001; Scopus ID: 2-s2.0-85152060424; OAI: oai:DiVA.org:umu-199231; DiVA, id: diva2:1693941
Funder
Marianne och Marcus Wallenbergs Stiftelse
Note

Originally included in thesis in manuscript form.

Available from: 2022-09-08 Created: 2022-09-08 Last updated: 2024-06-05 Bibliographically approved
In thesis
1. Valid causal inference in high-dimensional and complex settings
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Giltig kausalinferens med högdimensionella och komplexa data
Abstract [en]

The objective of this thesis is to consider some challenges that arise when conducting causal inference based on observational data. High dimensionality can occur when it is necessary to adjust for many covariates, and flexible models must be used to meet convergence assumptions. The latter may require the use of a novel machine learning estimator. Estimating nonparametrically-defined causal estimands at parametric rates and obtaining good-quality confidence intervals (with near nominal coverage) are the primary goals. Another challenge is providing a sensitivity analysis that can be applied in high-dimensional scenarios as a way of assessing the robustness of the results to missing confounders. 

Four papers are included in the thesis. A common theme in all the papers is covariate selection or nonparametric estimation of nuisance models. To provide insight into the performance of the approaches presented, some theoretical results are provided. Additionally, simulation studies are reported. In paper I, covariate selection is discussed as a method for removing redundant variables. This approach is compared to other strategies for variable selection that ensure reasonable confidence interval coverage. Paper II integrates variable selection into a sensitivity analysis, where the sensitivity parameter is the conditional correlation of the outcome and treatment variables. The validity of the analysis where the sensitivity parameter is small relative to the sample size is shown theoretically. In simulation settings, however, the analysis performs as expected, even for larger values of sensitivity parameters, when using a correction of the estimator of the residual variance for the outcome model. Paper IV extends the applicability of the sensitivity analysis method through the use of a different residual variance estimator and applies it to a real study of the effects of smoking during pregnancy on child birth weight. A real data problem of analysing the effect of early retirement on health outcomes is studied in Paper III. Rather than using variable selection strategies, convolutional neural networks are studied to fit the nuisance models.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2022. p. 14
Series
Statistical studies, ISSN 1100-8989 ; 56
Keywords
Causal inference, high dimension, sensitivity analysis, variable selection, convolutional neural network, semiparametric efficiency bound
National category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
urn:nbn:se:umu:diva-199258 (URN); 978-91-7855-881-0 (ISBN); 978-91-7855-882-7 (ISBN)
Public defence
2022-10-07, Hörsal NBET.A.101, Norra Beteendevetarhuset, Umeå, 10:00 (English)
Opponent
Supervisors
Available from: 2022-09-16 Created: 2022-09-09 Last updated: 2024-06-05 Bibliographically approved

Open Access in DiVA

No full text in DiVA


Authors

Moosavi, Niloofar; Häggström, Jenny; de Luna, Xavier
