Doubly Robust Off-Policy Learning on Low-Dimensional Manifolds by Deep Neural Networks
Causal inference explores the causation between actions and the consequent rewards on a covariate set. Recently deep learning has achieved a remarkable performance in causal inference, but existing statistical theories cannot well explain such an empirical success, especially when the covariates are high-dimensional. Most theoretical results in causal inference are asymptotic, suffer from the curse of dimensionality, and only work for the finite-action scenario. To bridge such a gap between theory and practice, this paper studies doubly robust off-policy learning by deep neural networks. When the covariates lie on a low-dimensional manifold, we prove nonasymptotic regret bounds, which converge at a fast rate depending on the intrinsic dimension of the manifold. Our results cover both the finite- and continuous-action scenarios. Our theory shows that deep neural networks are adaptive to the low-dimensional geometric structures of the covariates, and partially explains the success of deep learning for causal inference.
READ FULL TEXT