AutoML Two-Sample Test

06/17/2022
by Jonas M. Kübler, et al.

Two-sample tests are important in statistics and machine learning, both as tools for scientific discovery and for detecting distribution shifts. This has led to the development of many sophisticated test procedures that go beyond the standard supervised learning frameworks and whose usage can require specialized knowledge about two-sample testing. We use a simple test that takes the mean discrepancy of a witness function as the test statistic, and we prove that minimizing a squared loss leads to a witness with optimal testing power. This allows us to leverage recent advancements in AutoML. Without any user input about the problem at hand, and using the same method for all our experiments, our AutoML two-sample test achieves competitive performance on a diverse distribution shift benchmark as well as on challenging two-sample testing problems. We provide an implementation of the AutoML two-sample test in the Python package autotst.
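The idea in the abstract can be illustrated with a minimal sketch. It is not the autotst API and not the authors' implementation. The function name `witness_two_sample_test`, the quadratic feature map, the ridge-regularized least-squares fit (a simple stand-in for an AutoML regressor), the 50/50 data split, and the permutation-based p-value are all assumptions made for illustration. The sketch fits a witness by minimizing a squared loss on one half of the data, then uses the difference of the witness means on the held-out half as the test statistic.

```python
import numpy as np


def witness_two_sample_test(x, y, n_perm=1000, train_frac=0.5, ridge=1e-3, seed=0):
    """Hypothetical sketch of a witness-based two-sample test.

    x, y: arrays of shape (n, d) and (m, d) drawn from P and Q.
    Returns (statistic, p_value).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    # Shuffle each sample and split it into a training and a test part.
    x = rng.permutation(x)
    y = rng.permutation(y)
    nx, ny = int(len(x) * train_frac), int(len(y) * train_frac)
    x_tr, x_te = x[:nx], x[nx:]
    y_tr, y_te = y[:ny], y[ny:]

    def features(z):
        # Assumed feature map: intercept, linear and squared terms.
        return np.hstack([np.ones((len(z), 1)), z, z ** 2])

    # Fit the witness by minimizing a (ridge-regularized) squared loss,
    # regressing label 1 for samples from P and label 0 for samples from Q.
    a = features(np.vstack([x_tr, y_tr]))
    labels = np.concatenate([np.ones(len(x_tr)), np.zeros(len(y_tr))])
    w = np.linalg.solve(a.T @ a + ridge * np.eye(a.shape[1]), a.T @ labels)

    # Test statistic: mean discrepancy of the witness on held-out data.
    h_x = features(x_te) @ w
    h_y = features(y_te) @ w
    stat = h_x.mean() - h_y.mean()

    # One-sided permutation p-value on the held-out witness values.
    pooled = np.concatenate([h_x, h_y])
    n = len(h_x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if perm[:n].mean() - perm[n:].mean() >= stat:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return stat, p_value
```

Calling `witness_two_sample_test(x, y)` on two arrays of shape `(n, d)` returns the statistic and a p-value; a small p-value suggests the two samples come from different distributions. In the paper's setting, the least-squares fit here would be replaced by an AutoML regressor.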

research
02/10/2021

An Optimal Witness Function for Two-Sample Testing

We propose data-dependent test statistics based on a one-dimensional wit...
research
07/02/2020

A New ECDF Two-Sample Test Statistic

Empirical cumulative distribution functions (ECDFs) have been used to te...
research
03/13/2020

Two-Sample High Dimensional Mean Test Based On Prepivots

Testing equality of mean vectors is a very commonly used criterion when ...
research
11/19/2021

Maximum Mean Discrepancy for Generalization in the Presence of Distribution and Missingness Shift

Covariate shifts are a common problem in predictive modeling on real-wor...
research
12/23/2019

Study on upper limit of sample sizes for a two-level test in NIST SP800-22

NIST SP800-22 is one of the widely used statistical testing tools for ps...
research
06/06/2021

Neural Tangent Kernel Maximum Mean Discrepancy

We present a novel neural network Maximum Mean Discrepancy (MMD) statist...
research
06/21/2022

Sharp Constants in Uniformity Testing via the Huber Statistic

Uniformity testing is one of the most well-studied problems in property ...