
Active Testing: Sample-Efficient Model Evaluation

by Jannik Kossen, et al.

We introduce active testing: a new framework for sample-efficient model evaluation. While approaches like active learning reduce the number of labels needed for model training, existing literature largely ignores the cost of labeling test data and typically makes the unrealistic assumption that large test sets are available for model evaluation. This creates a disconnect from real applications, where test labels are important and just as expensive, e.g. for optimizing hyperparameters. Active testing addresses this by carefully selecting the test points to label, ensuring model evaluation is sample-efficient. To this end, we derive theoretically grounded and intuitive acquisition strategies that are specifically tailored to the goals of active testing, noting that these goals are distinct from those of active learning. Actively selecting labels introduces a bias; we show how to remove that bias while reducing the variance of the estimator at the same time. Active testing is easy to implement, effective, and can be applied to any supervised machine learning method. We demonstrate this on models including WideResNet and Gaussian processes on datasets including CIFAR-100.
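The bias introduced by non-uniform label selection, and its removal, can be illustrated with a small sketch. The abstract does not spell out the estimator, so the code below swaps in plain importance sampling with replacement, which is not necessarily the estimator the authors derive. The function name `importance_weighted_risk`, the synthetic losses, and the noisy acquisition score are all assumptions made for illustration.

```python
import numpy as np

def importance_weighted_risk(losses, q, num_labels, rng):
    """Estimate the mean pool loss from `num_labels` actively chosen points.

    losses: per-point losses for the whole pool (in practice only the
            sampled entries would be known, since each one costs a label).
    q:      proposal probabilities over the pool (positive, summing to 1).
    """
    losses = np.asarray(losses, dtype=float)
    q = np.asarray(q, dtype=float)
    n = losses.shape[0]
    # Sample indices with replacement, favouring points with large q.
    idx = rng.choice(n, size=num_labels, replace=True, p=q)
    # The weight 1 / (n * q_i) undoes the preference the sampling introduced,
    # so the weighted average is unbiased for the uniform pool mean.
    weights = 1.0 / (n * q[idx])
    return float(np.mean(weights * losses[idx]))

rng = np.random.default_rng(0)
true_losses = rng.exponential(scale=1.0, size=1000)  # synthetic pool losses

# Hypothetical acquisition score: a noisy guess of each point's loss.
scores = true_losses * rng.uniform(0.5, 1.5, size=1000) + 1e-3
q = scores / scores.sum()
uniform = np.full(1000, 1 / 1000)  # baseline: ordinary random test labels

# Repeat the 20-label experiment many times to compare the two estimators.
active = [importance_weighted_risk(true_losses, q, 20, rng) for _ in range(2000)]
random_ = [importance_weighted_risk(true_losses, uniform, 20, rng) for _ in range(2000)]

print("pool mean loss:   ", true_losses.mean())
print("active  mean/var: ", np.mean(active), np.var(active))
print("uniform mean/var: ", np.mean(random_), np.var(random_))
```

In this synthetic setup the proposal roughly tracks the true loss, so the weighted estimate stays centred on the pool mean while varying less than uniform sampling. A poorly chosen acquisition score could instead increase the variance, which is one reason the choice of acquisition strategy matters.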



Active Surrogate Estimators: An Active Learning Approach to Label-Efficient Model Evaluation

We propose Active Surrogate Estimators (ASEs), a new method for label-ef...

Active Learning with Weak Labels for Gaussian Processes

Annotating data for supervised learning can be costly. When the annotati...

Bayesian Active Learning with Fully Bayesian Gaussian Processes

The bias-variance trade-off is a well-known problem in machine learning ...

On Statistical Bias In Active Learning: How and When To Fix It

Active learning is a powerful tool when labelling data is expensive, but...

Active learning with RESSPECT: Resource allocation for extragalactic astronomical transients

The recent increase in volume and complexity of available astronomical d...

Robust online active learning

In many industrial applications, obtaining labeled observations is not s...

Prioritized training on points that are learnable, worth learning, and not yet learned

We introduce Goldilocks Selection, a technique for faster model training...