kd-switch: A Universal Online Predictor with an application to Sequential Two-Sample Testing

01/23/2019
by Alix Lheritier, et al.

We propose a novel online predictor for discrete labels conditioned on multivariate features in R^d. The predictor is pointwise universal: its normalized log loss is asymptotically as good as the true conditional entropy of the labels given the features. The predictor is based on a feature space discretization induced by a full-fledged k-d tree with randomly picked directions and a switch distribution. It requires no hyperparameter setting and automatically selects the most relevant scales in the feature space. Using recent results, a consistent sequential two-sample test is built from this predictor. In terms of discrimination power, on selected challenging datasets, it is comparable to or better than state-of-the-art non-sequential two-sample tests based on the train-test paradigm and a recent sequential test requiring hyperparameters. The time complexity to process the n-th sample point is O(log n) in probability (with respect to the distribution generating the data points), in contrast to the linear complexity of the previous sequential approach.
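
To make the idea of building a sequential two-sample test from an online predictor concrete, the following minimal sketch may help. It is not the kd-switch predictor: as a stand-in it uses a fixed one-dimensional histogram with a Krichevsky-Trofimov estimator in each bin, and all names (`kt_prob`, `sequential_two_sample_test`, `n_bins`, `lo`, `hi`) are hypothetical. It assumes each point arrives with a label in {0, 1} naming the sample it came from, and that labels are drawn with known prior probability 1/2. Under the null hypothesis the labels are independent of the features, so the running ratio between the predictor's likelihood and the prior likelihood is a nonnegative martingale with mean 1, and rejecting once it reaches 1/alpha keeps the type I error at most alpha at any stopping time.

```python
import math


def kt_prob(counts, label, n_labels=2):
    """Krichevsky-Trofimov probability of `label` given the past label counts."""
    return (counts[label] + 0.5) / (sum(counts) + n_labels / 2)


def sequential_two_sample_test(stream, alpha=0.05, n_bins=8, lo=-4.0, hi=4.0):
    """Return the first index t at which the null is rejected, else None.

    `stream` yields (x, label) pairs with scalar x and label in {0, 1}.
    Assumption: labels have prior probability 1/2 under the null.
    The histogram predictor is a stand-in, not the kd-switch predictor.
    """
    counts = [[0, 0] for _ in range(n_bins)]  # per-bin label counts
    log_ratio = 0.0                           # log of predictor / prior likelihood
    threshold = math.log(1.0 / alpha)
    for t, (x, label) in enumerate(stream, start=1):
        # Locate the bin of x, clamping values outside [lo, hi).
        b = int((x - lo) / (hi - lo) * n_bins)
        b = min(n_bins - 1, max(0, b))
        # Predict the label BEFORE updating the counts (online protocol).
        p = kt_prob(counts[b], label)
        log_ratio += math.log(p) - math.log(0.5)
        counts[b][label] += 1
        if log_ratio >= threshold:
            return t  # enough evidence that the two samples differ
    return None


if __name__ == "__main__":
    # Toy stream: sample 0 always sits at x = -1, sample 1 at x = +1.
    toy = [(-1.0, 0), (1.0, 1)] * 10
    print(sequential_two_sample_test(toy))
```

The better the predictor tracks the conditional distribution of labels given features, the faster the ratio grows when the two samples differ, which is why a universal predictor yields a consistent test.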
