Two-sample Testing Using Deep Learning

10/14/2019
by Matthias Kirchler, et al.

We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Both test statistics can be evaluated in time linear in the sample size. Suitable data representations are obtained in a data-driven way, by solving a supervised or unsupervised transfer-learning task on an auxiliary (potentially distinct) data set. If no auxiliary data is available, we split the data into two chunks: one for learning representations and one for computing the test statistic. In experiments on audio samples, natural images, and three-dimensional neuroimaging data, our tests yield significant decreases in type-2 error rate (up to 35 percentage points) compared to state-of-the-art two-sample tests such as kernel methods and classifier two-sample tests.
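
The abstract does not spell out the two test statistics, so the following is only a minimal sketch of the general idea: map both samples through a fixed representation and run an asymptotic location test on the mapped points. It assumes a Hotelling-type statistic with a chi-square reference distribution, which is one standard location test and not necessarily either of the authors' statistics. The names `feature_map`, `location_test`, and `chi2_sf_even`, the `tanh` layer with random weights (a stand-in for a pretrained hidden layer), and the small ridge term are all hypothetical choices made for illustration.

```python
import math
import random


def feature_map(x, weights):
    """Hypothetical stand-in for a pretrained hidden layer: tanh(W x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]


def mean_vec(rows):
    """Column-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_cov(a, b, mean_a, mean_b):
    """Pooled sample covariance of two groups of feature vectors."""
    h = len(mean_a)
    cov = [[0.0] * h for _ in range(h)]
    for rows, mu in ((a, mean_a), (b, mean_b)):
        for r in rows:
            d = [ri - mi for ri, mi in zip(r, mu)]
            for i in range(h):
                for j in range(h):
                    cov[i][j] += d[i] * d[j]
    dof = len(a) + len(b) - 2
    return [[c / dof for c in row] for row in cov]


def solve(mat, vec):
    """Solve mat @ x = vec by Gaussian elimination with partial pivoting."""
    n = len(vec)
    aug = [row[:] + [v] for row, v in zip(mat, vec)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def chi2_sf_even(x, dof):
    """Chi-square survival function; closed form valid for even dof only."""
    assert dof % 2 == 0, "this closed form assumes an even number of features"
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def location_test(xs, ys, weights, ridge=1e-6):
    """Hotelling-type location test on hidden-layer features (a sketch)."""
    fa = [feature_map(x, weights) for x in xs]
    fb = [feature_map(y, weights) for y in ys]
    ma, mb = mean_vec(fa), mean_vec(fb)
    diff = [p - q for p, q in zip(ma, mb)]
    cov = pooled_cov(fa, fb, ma, mb)
    for i in range(len(cov)):
        cov[i][i] += ridge  # assumed regularizer to keep the solve stable
    n, m = len(fa), len(fb)
    stat = n * m / (n + m) * sum(d * s for d, s in zip(diff, solve(cov, diff)))
    return stat, chi2_sf_even(stat, len(diff))


# Synthetic demo: 6-dimensional inputs, 4 hidden features, 300 points per sample.
rng = random.Random(0)
IN_DIM, HIDDEN_DIM, N = 6, 4, 300
weights = [[rng.gauss(0.0, 0.5) for _ in range(IN_DIM)] for _ in range(HIDDEN_DIM)]


def sample(n, shift):
    """Draw n Gaussian points whose coordinates all have mean `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(IN_DIM)] for _ in range(n)]


stat_same, p_same = location_test(sample(N, 0.0), sample(N, 0.0), weights)
stat_shift, p_shift = location_test(sample(N, 0.0), sample(N, 1.0), weights)
print("same distribution:    stat=%.3f p=%.4f" % (stat_same, p_same))
print("shifted distribution: stat=%.3f p=%.3g" % (stat_shift, p_shift))
```

Each sample is passed through the representation once and only means and a covariance are accumulated, so the cost grows linearly with the sample size for a fixed number of hidden features. In practice the random `weights` would be replaced by a network trained on auxiliary data or on a held-out split, as the abstract describes.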

