Valid causal inference in high-dimensional and complex settings
Umeå University, Faculty of Social Sciences, Umeå School of Business and Economics (USBE), Statistics. ORCID iD: 0000-0001-5442-9708
2022 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title
Giltig kausalinferens med högdimensionella och komplexa data (Swedish)
Abstract [en]

The objective of this thesis is to consider some challenges that arise when conducting causal inference based on observational data. High dimensionality can occur when it is necessary to adjust for many covariates, and flexible models must be used to meet convergence assumptions. The latter may require the use of a novel machine learning estimator. Estimating nonparametrically-defined causal estimands at parametric rates and obtaining good-quality confidence intervals (with near nominal coverage) are the primary goals. Another challenge is providing a sensitivity analysis that can be applied in high-dimensional scenarios as a way of assessing the robustness of the results to missing confounders. 

Four papers are included in the thesis. A common theme in all the papers is covariate selection or nonparametric estimation of nuisance models. Theoretical results and simulation studies provide insight into the performance of the approaches presented. In Paper I, covariate selection is discussed as a method for removing redundant variables. This approach is compared to other strategies for variable selection that ensure reasonable confidence interval coverage. Paper II integrates variable selection into a sensitivity analysis, where the sensitivity parameter is the conditional correlation of the outcome and treatment variables. The analysis is shown theoretically to be valid when the sensitivity parameter is small relative to the sample size. In simulation settings, however, the analysis performs as expected even for larger values of the sensitivity parameter, provided that a corrected estimator of the residual variance of the outcome model is used. Paper IV extends the applicability of the sensitivity analysis method through the use of a different residual variance estimator and applies it to a real study of the effects of smoking during pregnancy on child birth weight. A real data problem of analysing the effect of early retirement on health outcomes is studied in Paper III. Rather than using variable selection strategies, convolutional neural networks are studied to fit the nuisance models.

Place, publisher, year, edition, pages
Umeå: Umeå University, 2022, p. 14
Series
Statistical studies, ISSN 1100-8989 ; 56
Keywords [en]
Causal inference, high dimension, sensitivity analysis, variable selection, convolutional neural network, semiparametric efficiency bound
National Category
Probability Theory and Statistics
Research subject
Statistics
Identifiers
URN: urn:nbn:se:umu:diva-199258
ISBN: 978-91-7855-881-0 (print)
ISBN: 978-91-7855-882-7 (electronic)
OAI: oai:DiVA.org:umu-199258
DiVA, id: diva2:1694417
Public defence
2022-10-07, Hörsal NBET.A.101, Norra Beteendevetarhuset, Umeå, 10:00 (English)
Opponent
Supervisors
Available from: 2022-09-16 Created: 2022-09-09 Last updated: 2024-06-05 Bibliographically approved
List of papers
1. The costs and benefits of uniformly valid causal inference with high-dimensional nuisance parameters
2023 (English). In: Statistical Science, ISSN 0883-4237, E-ISSN 2168-8745, Vol. 38, no 1, p. 1-12. Article in journal (Refereed). Published
Abstract [en]

Important advances have recently been achieved in developing procedures yielding uniformly valid inference for a low-dimensional causal parameter when high-dimensional nuisance models must be estimated. In this paper, we review the literature on uniformly valid causal inference and discuss the costs and benefits of using uniformly valid inference procedures. Naive estimation strategies based on regularisation, machine learning, or a preliminary model selection stage for the nuisance models have finite sample distributions which are badly approximated by their asymptotic distributions. To solve this serious problem, estimators which converge uniformly in distribution over a class of data generating mechanisms have been proposed in the literature. In order to obtain uniformly valid results in high-dimensional situations, sparsity conditions typically need to be imposed on the nuisance models, although a double robustness property holds, whereby if one of the nuisance models is sparser, the other nuisance model is allowed to be less sparse. While uniformly valid inference is a highly desirable property, uniformly valid procedures pay a high price in terms of inflated variability. Our discussion of this dilemma is illustrated by the study of a double-selection outcome regression estimator, which we show is uniformly asymptotically unbiased, but is less variable than uniformly valid estimators in the numerical experiments conducted.
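
The poor finite-sample approximation after a preliminary model selection stage can be illustrated with a small Monte Carlo experiment. The sketch below is not taken from the paper: it assumes a toy linear model with one continuous exposure `t`, one covariate `x`, no intercept, and a hypothetical pretest rule that drops `x` when its t-statistic is below 1.96. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def ols_short(y, t):
    """OLS of y on t alone (no intercept): returns (tau_hat, se_tau)."""
    stt = sum(ti * ti for ti in t)
    tau = sum(ti * yi for ti, yi in zip(t, y)) / stt
    rss = sum((yi - tau * ti) ** 2 for ti, yi in zip(t, y))
    return tau, math.sqrt(rss / (len(y) - 1) / stt)


def ols_full(y, t, x):
    """OLS of y on (t, x), no intercept: returns (tau_hat, se_tau, beta_hat, se_beta)."""
    stt = sum(ti * ti for ti in t)
    sxx = sum(xi * xi for xi in x)
    stx = sum(ti * xi for ti, xi in zip(t, x))
    sty = sum(ti * yi for ti, yi in zip(t, y))
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = stt * sxx - stx * stx
    tau = (sxx * sty - stx * sxy) / det
    beta = (stt * sxy - stx * sty) / det
    rss = sum((yi - tau * ti - beta * xi) ** 2 for yi, ti, xi in zip(y, t, x))
    s2 = rss / (len(y) - 2)
    return tau, math.sqrt(s2 * sxx / det), beta, math.sqrt(s2 * stt / det)


def coverage(n=100, tau=1.0, beta=0.2, rho=0.8, reps=2000, seed=0, z=1.96):
    """Empirical coverage of nominal 95% intervals for tau:
    (always adjust for x, adjust for x only if the pretest keeps it)."""
    rng = random.Random(seed)
    hit_full = hit_pre = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # exposure correlated with the covariate (correlation rho)
        t = [rho * xi + math.sqrt(1 - rho ** 2) * rng.gauss(0.0, 1.0) for xi in x]
        y = [tau * ti + beta * xi + rng.gauss(0.0, 1.0) for ti, xi in zip(t, x)]
        tf, sf, b, sb = ols_full(y, t, x)
        hit_full += abs(tf - tau) <= z * sf
        # naive strategy: drop x when it looks insignificant, then ignore the selection step
        tp, sp = ols_short(y, t) if abs(b / sb) < z else (tf, sf)
        hit_pre += abs(tp - tau) <= z * sp
    return hit_full / reps, hit_pre / reps


if __name__ == "__main__":
    full, pretest = coverage()
    print(f"coverage, always adjusting: {full:.3f}")
    print(f"coverage, naive pretest:    {pretest:.3f}")
```

With a covariate effect of the same order as its standard error, the pretest often drops a genuine confounder, so the naive interval is centred on a biased estimate while its width ignores the selection step.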

Place, publisher, year, edition, pages
Institute of Mathematical Statistics, 2023
Keywords
Double robustness, Machine learning, Post-model selection inference, Regularization, Superefficiency
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-199231 (URN)
10.1214/21-STS843 (DOI)
000991879600001 ()
2-s2.0-85152060424 (Scopus ID)
Funder
Marianne and Marcus Wallenberg Foundation
Note

Originally included in thesis in manuscript form.

Available from: 2022-09-08 Created: 2022-09-08 Last updated: 2024-06-05 Bibliographically approved
2. Valid causal inference: model selection and sensitivity to unobserved confounding in high-dimensional settings
(English) Manuscript (preprint) (Other academic)
Abstract [en]

Recently, various methods have been proposed to estimate causal effects with confidence intervals that are uniformly valid over a set of data generating processes, when high-dimensional nuisance models are estimated by post-model selection or machine learning estimators. These methods typically require that all the confounders are observed to ensure identification of the effects. We contribute by showing how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models. We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small relative to the sample size; the latter is formalized in terms of convergence rates. Simulation experiments illustrate the finite sample properties of the proposed intervals and investigate an alternative procedure that improves the empirical coverage of the intervals when the amount of unobserved confounding is large.
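
One common way to build an uncertainty interval is to take the union of confidence intervals for the bias-corrected estimate over a plausible range of the sensitivity parameter. The sketch below illustrates that general idea only and is not the manuscript's procedure: it assumes, hypothetically, that the confounding bias is linear in the sensitivity parameter `rho` with a known slope `bias_slope`, and every name and number in it is an illustrative assumption.

```python
from statistics import NormalDist


def uncertainty_interval(estimate, se, bias_slope, rho_range, level=0.95):
    """Union of confidence intervals over rho in rho_range.

    Assumption (hypothetical): bias(rho) = bias_slope * rho, so the
    bias-corrected estimate is estimate - bias_slope * rho. Because this
    correction is linear in rho, its extremes occur at the endpoints.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    rho_lo, rho_hi = rho_range
    corrected = [estimate - bias_slope * rho for rho in (rho_lo, rho_hi)]
    return min(corrected) - z * se, max(corrected) + z * se


if __name__ == "__main__":
    # Hypothetical inputs: point estimate 1.0, standard error 0.1
    print(uncertainty_interval(1.0, 0.1, bias_slope=0.5, rho_range=(-0.2, 0.2)))
```

Setting `rho_range=(0.0, 0.0)` recovers the ordinary confidence interval that assumes no unobserved confounding, and widening the range shows how far the conclusion can be pushed before the interval covers zero.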

Keywords
Average treatment effect, Inverse probability weighting, Double robust estimator
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-199233 (URN)
Funder
Marianne and Marcus Wallenberg Foundation
Available from: 2022-09-08 Created: 2022-09-08 Last updated: 2024-06-05
3. Convolutional neural networks for valid and efficient causal inference
2024 (English). In: Journal of Computational and Graphical Statistics, ISSN 1061-8600, E-ISSN 1537-2715, Vol. 33, no 2, p. 714-723. Article in journal (Refereed). Published
Abstract [en]

Convolutional neural networks (CNN) have been successful in machine learning applications including image classification. When it comes to images, their success relies on their ability to consider the space invariant local features in the data. Here, we consider the use of CNN to fit nuisance models in semiparametric estimation of a one dimensional causal parameter: the average causal effect of a binary treatment. In this setting, nuisance models are functions of pre-treatment covariates that need to be controlled for. In an application where we want to estimate the effect of early retirement on a health outcome, we propose to use CNN to control for time-structured covariates. Thus, CNN is used when fitting nuisance models explaining the treatment assignment and the outcome. These fits are then combined into an augmented inverse probability weighting estimator yielding efficient and uniformly valid inference. Theoretically, we contribute by providing rates of convergence for CNN equipped with the rectified linear unit activation function and compare it to an existing result for feedforward neural networks. We also show when those rates guarantee uniformly valid inference for the proposed estimator. A Monte Carlo study is provided where the performance of the proposed estimator is evaluated and compared with other strategies. Finally, we give results on a study of the effect of early retirement on later hospitalization using a database covering the whole Swedish population.
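
The way the two nuisance fits are combined in an augmented inverse probability weighting (AIPW) estimator can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: instead of CNNs, it assumes a single binary covariate and fits both nuisance models by stratum means, and the simulated data, function names, and true effect of 2.0 are all assumptions made for the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def simulate(n, seed=0):
    """Toy data (assumed): binary confounder x, binary treatment t, outcome y.
    The true average causal effect is 2.0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        t = 1 if rng.random() < (0.7 if x == 1 else 0.3) else 0
        y = 2.0 * t + 3.0 * x + rng.gauss(0.0, 1.0)
        data.append((x, t, y))
    return data


def aipw(data, level=0.95):
    """AIPW estimate of the average causal effect with a Wald interval.
    Requires treated and untreated units in every stratum of x."""
    e = {}  # propensity score e[x] = P(T = 1 | X = x)
    m = {}  # outcome regression m[(a, x)] = E(Y | T = a, X = x)
    for x in (0, 1):
        rows = [(t, y) for xi, t, y in data if xi == x]
        e[x] = mean(t for t, _ in rows)
        for a in (0, 1):
            m[(a, x)] = mean(y for t, y in rows if t == a)
    # Per-unit contributions: outcome-model contrast plus weighted residuals
    phi = [
        m[(1, x)] - m[(0, x)]
        + t * (y - m[(1, x)]) / e[x]
        - (1 - t) * (y - m[(0, x)]) / (1 - e[x])
        for x, t, y in data
    ]
    est = mean(phi)
    se = stdev(phi) / math.sqrt(len(phi))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return est, (est - z * se, est + z * se)


if __name__ == "__main__":
    sample = simulate(5000, seed=1)
    estimate, interval = aipw(sample)
    print(f"AIPW estimate: {estimate:.3f}, interval: {interval}")
```

In a setting like the one the abstract describes, the dictionaries `e` and `m` would be replaced by predictions from the fitted nuisance models, while the combination step would keep the same form.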

Place, publisher, year, edition, pages
Taylor & Francis, 2024
Keywords
Average causal effect, augmented inverse probability weighting, early retirement, rate double robustness, post-machine learning inference
National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-199235 (URN)
10.1080/10618600.2023.2257247 (DOI)
001119527600001 ()
2-s2.0-85174611652 (Scopus ID)
Funder
Marianne and Marcus Wallenberg Foundation
Swedish Research Council
Note

Originally included in thesis in manuscript form.

Available from: 2022-09-08 Created: 2022-09-08 Last updated: 2024-10-01 Bibliographically approved
4. A note on sensitivity analysis for post-machine learning causal inference
(English) Manuscript (preprint) (Other academic)
Abstract [en]

In Moosavi et al. (2022), a method for sensitivity analysis to unobserved confounding was proposed for estimating an average causal effect with a double robust estimator in high-dimensional situations. For this purpose, it was assumed that linear models could sparsely approximate the nuisance functions (treatment assignment and outcome models). In this note, we relax these assumptions, making the sensitivity analysis more generally applicable, for instance when nuisance functions are (weakly) consistently estimated with machine learning algorithms. Simulations and a case study illustrate the performance and use of the method.

National Category
Probability Theory and Statistics
Identifiers
urn:nbn:se:umu:diva-199257 (URN)
Funder
Marianne and Marcus Wallenberg Foundation
Available from: 2022-09-09 Created: 2022-09-09 Last updated: 2024-06-05

Open Access in DiVA

fulltext (299 kB), 496 downloads
File information
File name: FULLTEXT01.pdf
File size: 299 kB
Checksum (SHA-512): 492eb86188010f43bd54a04edf91a8378bcd74d98f2e31066df9672b417a90210f96e49ae472ead4d9346867333e794ced9e769f66c958dacb1b509c59c012f2
Type: fulltext
Mimetype: application/pdf
spikblad (126 kB), 76 downloads
File information
File name: SPIKBLAD01.pdf
File size: 126 kB
Checksum (SHA-512): 5417c01726098f47045f67231542aae79fd24c63c5029bbe90bd9099c6202d4b6fcdc18aa95febb3426448f407cd048ace3852e7fb13b7e2115f1b347b5f29e1
Type: spikblad
Mimetype: application/pdf

Authority records

Moosavi, Niloofar
