Statistical Testing under Distributional Shifts

05/22/2021
by   Nikolaj Thams, et al.

Statistical hypothesis testing is a central problem in empirical inference. Observing data from a distribution P^*, one is interested in the hypothesis P^* ∈ H_0 and requires any test to control the probability of false rejections. In this work, we introduce statistical testing under distributional shifts. We are still interested in a target hypothesis P^* ∈ H_0, but observe data from a different distribution Q^* in an observational domain. We assume that P^* is related to Q^* through a known shift τ and formally introduce a framework for hypothesis testing in this setting. We propose a general testing procedure that first resamples from the n observed data points to construct an auxiliary data set (mimicking properties of P^*) and then applies an existing test in the target domain. We prove that this procedure holds pointwise asymptotic level, provided that the target test holds pointwise asymptotic level, the size of the resample is at most o(√(n)), and the resampling weights are well-behaved. We further show that if the map τ is unknown, it can, under mild conditions, be estimated from data while maintaining level guarantees. Testing under distributional shifts allows us to tackle a diverse set of problems. We argue that it may prove useful in reinforcement learning, we show how it reduces conditional independence testing to unconditional independence testing, and we provide example applications in causal inference. The code is easy to use and will be available online.
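The resample-then-test idea can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the shift τ is given as a density ratio p(x)/q(x), and it uses a simple sequential weighted draw without replacement, which may differ from the resampling scheme analysed in the paper. The names `resample_and_test`, `z_test_mean`, and the exponent 0.45 are hypothetical choices for illustration; the Gaussian example is likewise made up.

```python
import math
import random
import statistics


def resample_and_test(data, weight_fn, target_test, exponent=0.45, seed=0):
    """Draw m = floor(n**exponent) distinct points with probability
    proportional to weight_fn, then run target_test on the resample."""
    rng = random.Random(seed)
    n = len(data)
    m = max(2, int(n ** exponent))  # exponent < 0.5 keeps m = o(sqrt(n))
    pool = list(data)
    weights = [weight_fn(x) for x in pool]
    resample = []
    for _ in range(m):
        # Sequential weighted draw without replacement (a simplification,
        # assumed here for illustration only).
        i = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        resample.append(pool.pop(i))
        weights.pop(i)
    return target_test(resample), resample


def z_test_mean(sample, mu0=1.0):
    """Two-sided large-sample z-test of H0: mean == mu0; returns a p-value."""
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    z = (statistics.mean(sample) - mu0) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical setup: we observe Q* = N(0, 1) but the target is P* = N(1, 1).
# The density ratio p(x) / q(x) then equals exp(x - 0.5).
data_rng = random.Random(1)
observed = [data_rng.gauss(0.0, 1.0) for _ in range(5000)]
p_value, resample = resample_and_test(
    observed, lambda x: math.exp(x - 0.5), z_test_mean
)
print(len(resample), round(p_value, 3))
```

Here the target test checks whether the mean under P^* equals 1, although every observed point was drawn from Q^*. Keeping the exponent below 1/2 reflects the o(√(n)) condition on the resample size stated above.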


research
10/15/2020

Distributional Null Hypothesis Testing with the T distribution

Null Hypothesis Significance Testing (NHST) has long been central to the...
research
12/23/2020

Testing whether a Learning Procedure is Calibrated

A learning procedure takes as input a dataset and performs inference for...
research
02/27/2020

A Distributional Framework for Data Valuation

Shapley value is a classic notion from game theory, historically used to...
research
09/07/2020

New Upper Bounds in the Hypothesis Testing Problem with Information Constraints

We consider a hypothesis testing problem where a part of data cannot be ...
research
06/05/2023

Inference under constrained distribution shifts

Large-scale administrative or observational datasets are increasingly us...
research
06/20/2019

On the probability of a causal inference is robust for internal validity

The internal validity of observational study is often subject to debate....
research
04/16/2019

Scalable and Efficient Hypothesis Testing with Random Forests

Throughout the last decade, random forests have established themselves a...
