Counterfactual inference for sequential experimental design
We consider the problem of counterfactual inference in sequentially designed experiments wherein a collection of 𝐍 units each undergo a sequence of interventions for 𝐓 time periods, based on policies that sequentially adapt over time. Our goal is counterfactual inference, i.e., estimate what would have happened if alternate policies were used, a problem that is inherently challenging due to the heterogeneity in the outcomes across units and time. To tackle this task, we introduce a suitable latent factor model where the potential outcomes are determined by exogenous unit and time level latent factors. Under suitable conditions, we show that it is possible to estimate the missing (potential) outcomes using a simple variant of nearest neighbors. First, assuming a bilinear latent factor model and allowing for an arbitrary adaptive sampling policy, we establish a distribution-free non-asymptotic guarantee for estimating the missing outcome of any unit at any time; under suitable regularity condition, this guarantee implies that our estimator is consistent. Second, for a generic non-parametric latent factor model, we establish that the estimate for the missing outcome of any unit at time 𝐓 satisfies a central limit theorem as 𝐓→∞, under suitable regularity conditions. Finally, en route to establishing this central limit theorem, we establish a non-asymptotic mean-squared-error bound for the estimate of the missing outcome of any unit at time 𝐓. Our work extends the recently growing literature on inference with adaptively collected data by allowing for policies that pool across units, and also compliments the matrix completion literature when the entries are revealed sequentially in an arbitrarily dependent manner based on prior observed data.
READ FULL TEXT