Causal Inference in High Dimensions – Without Sparsity
We revisit the classical causal inference problem of estimating the average treatment effect in the presence of fully observed, high-dimensional confounding variables, where the number of confounders d is of the same order as the sample size n. To make the problem tractable, we posit a generalized linear model for the effect of the confounders on the treatment assignment and outcomes, but do not assume any sparsity. Instead, we only require the magnitude of confounding to remain non-degenerate. Although it relies on parametric assumptions, this setting is a useful surrogate for some machine learning methods used to adjust for confounding in two-stage methods. In particular, the estimation of the first stage adds variance that does not vanish, forcing us to confront terms in the asymptotic expansion that are normally brushed aside as finite-sample defects. Using new theoretical results and simulations, we compare G-computation, inverse propensity weighting (IPW), and two common doubly robust estimators: augmented IPW (AIPW) and targeted maximum likelihood estimation (TMLE). When the outcome model estimates are unbiased, G-computation outperforms the other estimators in both bias and variance. Among the doubly robust estimators, TMLE has the lowest variance. Existing theoretical results do not explain this advantage because the TMLE and AIPW estimators have the same asymptotic influence function. However, our model emphasizes differences in performance between these estimators beyond first-order asymptotics.
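To make the estimators named above concrete, here is a minimal sketch of G-computation, IPW, and AIPW on simulated data. It is not the paper's experimental setup: the linear outcome model, the logistic treatment model, the coefficient scaling, the propensity clipping, and the helper names `simulate`, `fit_logistic`, and `ate_estimators` are all assumptions made for illustration. TMLE is omitted to keep the sketch short.

```python
import numpy as np


def simulate(n, d, tau, seed):
    """Hypothetical data: Gaussian confounders, logistic treatment, linear outcome."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    gamma = np.ones(d) / np.sqrt(d)  # dense (non-sparse) treatment coefficients
    beta = np.ones(d) / np.sqrt(d)   # dense (non-sparse) outcome coefficients
    p = 1.0 / (1.0 + np.exp(-X @ gamma))
    t = (rng.random(n) < p).astype(float)
    y = tau * t + X @ beta + rng.standard_normal(n)
    return X, t, y


def fit_logistic(X, t, n_iter=25, ridge=1e-6):
    """Newton-Raphson logistic regression; the tiny ridge is only for numerical stability."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p) - ridge * beta
        hess = (X * w[:, None]).T @ X + ridge * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def ate_estimators(X, t, y, clip=0.01):
    """Return G-computation, IPW, and AIPW estimates of the average treatment effect."""
    n = len(y)
    ones = np.ones(n)

    # First stage, outcome model: least squares of y on [1, t, X].
    Z = np.column_stack([ones, t, X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    mu1 = np.column_stack([ones, ones, X]) @ coef          # predicted outcome if treated
    mu0 = np.column_stack([ones, np.zeros(n), X]) @ coef   # predicted outcome if untreated

    # First stage, propensity model: logistic regression of t on [1, X].
    Xc = np.column_stack([ones, X])
    e = 1.0 / (1.0 + np.exp(-Xc @ fit_logistic(Xc, t)))
    e = np.clip(e, clip, 1.0 - clip)  # clipping is an assumption, not from the paper

    gcomp = np.mean(mu1 - mu0)
    ipw = np.mean(t * y / e - (1.0 - t) * y / (1.0 - e))
    aipw = gcomp + np.mean(t * (y - mu1) / e - (1.0 - t) * (y - mu0) / (1.0 - e))
    return {"gcomp": gcomp, "ipw": ipw, "aipw": aipw}


X, t, y = simulate(n=2000, d=100, tau=1.0, seed=0)
for name, value in ate_estimators(X, t, y).items():
    print(f"{name}: {value:.3f}")
```

Here d/n is 0.05, which keeps the logistic fit well behaved. Repeating the simulation over many seeds while raising d toward n is one way to see the first-stage estimation variance that the abstract describes.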