Data-driven algorithms for dimension reduction in causal inference
(English) Manuscript (preprint) (Other academic)
In observational studies, the causal effect of a treatment may be confounded with variables that are related to both the treatment and the outcome of interest. In order to identify a causal effect, such studies often rely on the unconfoundedness assumption, i.e., that all confounding variables are observed. The choice of covariates to control for, which is primarily based on subject matter knowledge, may result in a large covariate vector in the attempt to ensure that unconfoundedness holds. However, including redundant covariates can affect bias and efficiency of nonparametric causal effect estimators, e.g., due to the curse of dimensionality. In this paper, data-driven algorithms for the selection of sufficient covariate subsets are investigated. Under the assumption of unconfoundedness, we search for minimal subsets of the covariate vector. Based on the framework of sufficient dimension reduction or kernel smoothing, the algorithms perform a backward elimination procedure testing the significance of each covariate. Their performance is evaluated in simulations, and an application using data from the Swedish Childhood Diabetes Register is also presented.
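The backward elimination idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the manuscript's algorithm: the function names `backward_eliminate` and `toy_p_value`, the significance level, and the rule of dropping the covariate with the largest p-value at each step are all assumptions made for illustration. The significance tests themselves (based on sufficient dimension reduction or kernel smoothing) are not reproduced; a fixed lookup table of made-up p-values stands in for them.

```python
def backward_eliminate(covariates, p_value, alpha=0.05):
    """Generic backward elimination over a list of covariate names.

    p_value(candidate, others) is a hypothetical callable that returns the
    p-value for testing whether `candidate` is still needed given `others`.
    """
    selected = list(covariates)
    while selected:
        # Test each remaining covariate against the rest of the current subset.
        pvals = {c: p_value(c, frozenset(selected) - {c}) for c in selected}
        weakest = max(selected, key=lambda c: pvals[c])
        if pvals[weakest] <= alpha:
            break  # every remaining covariate is significant at level alpha
        selected.remove(weakest)  # assumed rule: drop the least significant one
    return selected


def toy_p_value(candidate, others):
    # Stand-in for a real test: made-up p-values that ignore `others`.
    fixed = {"x1": 0.001, "x2": 0.01, "x3": 0.40, "x4": 0.75}
    return fixed[candidate]


subset = backward_eliminate(["x1", "x2", "x3", "x4"], toy_p_value)
print(subset)
```

With these made-up p-values the loop drops `x4`, then `x3`, and stops once the largest remaining p-value is at most `alpha`. In a real analysis the p-value of a covariate would generally change as other covariates are removed, which is why the test is re-evaluated on every pass.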
covariate selection, marginal co-ordinate hypothesis test, matching, kernel smoothing, type 1 diabetes mellitus
Probability Theory and Statistics
Research subject Statistics
Identifiers: URN: urn:nbn:se:umu:diva-80696; OAI: oai:DiVA.org:umu-80696; DiVA: diva2:651007