A New Covariate Selection Strategy for High Dimensional Data in Causal Effect Estimation with Multivariate Treatments

by Juan Chen, et al.

Covariate selection is crucial when estimating average treatment effects from observational data with high-dimensional or even ultra-high-dimensional pretreatment variables. Existing methods for this problem typically assume sparse linear models for both the outcome and a univariate treatment, and cannot handle ultra-high-dimensional covariates. In this paper, we propose a new covariate selection strategy, called double screening prior adaptive lasso (DSPAL), to select confounders and predictors of the outcome for multivariate treatments. DSPAL combines the adaptive lasso with marginal conditional (in)dependence prior information to select the target covariates, in order to eliminate confounding bias and improve statistical efficiency. The distinctive features of our proposal are that it can be applied to high-dimensional or even ultra-high-dimensional covariates with multivariate treatments, and that it can handle both parametric and nonparametric outcome models, which makes it more robust than other methods. Our theoretical analyses show that the proposed procedure enjoys the sure screening property, the ranking consistency property, and variable selection consistency. In a simulation study, we demonstrate that the proposed approach consistently selects all confounders and predictors, and that it estimates the multivariate treatment effects with smaller bias and mean squared error than several alternatives under various scenarios. In a real-data analysis, we apply the method to estimate the causal effect of a three-dimensional continuous environmental treatment on cholesterol level and obtain enlightening results.
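To make the general "screen first, then penalize adaptively" idea concrete, here is a minimal sketch in plain Python. It is not the DSPAL procedure itself, whose details the abstract does not give. It assumes a single marginal-correlation screen on the outcome only, adaptive lasso weights of 1/|marginal correlation|, a linear outcome model, and no treatment variables. All function names, the simulated data, and the tuning values `keep` and `lam` are hypothetical.

```python
import random


def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]


def soft_threshold(z, t):
    """Soft-thresholding operator used by lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def screen_then_adaptive_lasso(cols, y, keep, lam, n_iter=100):
    """Illustrative two-step selection (assumed, not the paper's algorithm).

    Step 1: keep the `keep` covariates with the largest absolute marginal
            correlation with the outcome.
    Step 2: run an adaptive lasso on the survivors by coordinate descent,
            penalizing covariate j with weight 1 / |marginal correlation|.
    Returns the sorted indices of covariates with nonzero coefficients.
    """
    n = len(y)
    ys = standardize(y)
    xs = [standardize(c) for c in cols]
    # For standardized vectors the correlation is the mean of the products.
    corr = [sum(a * b for a, b in zip(x, ys)) / n for x in xs]

    kept = sorted(range(len(xs)), key=lambda j: -abs(corr[j]))[:keep]

    beta = {j: 0.0 for j in kept}
    resid = list(ys)
    for _ in range(n_iter):
        for j in kept:
            x = xs[j]
            # Correlation of x_j with the partial residual (x_j'x_j / n = 1).
            rho = sum(xi * ri for xi, ri in zip(x, resid)) / n + beta[j]
            weight = 1.0 / max(abs(corr[j]), 1e-12)
            new = soft_threshold(rho, lam * weight)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * xi for r, xi in zip(resid, x)]
                beta[j] = new
    return sorted(j for j in kept if beta[j] != 0.0)


def simulate(seed=0, n=200, p=50):
    """Toy data: only covariates 0 and 1 affect the outcome."""
    rng = random.Random(seed)
    cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    y = [2.0 * cols[0][i] - 1.5 * cols[1][i] + rng.gauss(0, 1) for i in range(n)]
    return cols, y


cols, y = simulate()
selected = screen_then_adaptive_lasso(cols, y, keep=10, lam=0.1)
print(selected)
```

In this toy setting the screen reduces 50 covariates to 10 before any penalized fit is run, and the adaptive weights penalize weakly correlated survivors more heavily than strongly correlated ones. Handling multivariate treatments, a second screen, or nonparametric outcome models, as the abstract describes, would need more than this sketch.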


