Learning Diverse Representations for Fast Adaptation to Distribution Shift

06/12/2020
by Daniel Pace, et al.

The i.i.d. assumption is a useful idealization that underpins many successful approaches to supervised machine learning. However, its violation can lead to models that learn to exploit spurious correlations in the training data, rendering them vulnerable to adversarial interventions, undermining their reliability, and limiting their practical application. To mitigate this problem, we present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task. We propose a notion of diversity based on minimizing the conditional total correlation of final layer representations across models given the label, which we approximate using a variational estimator and minimize using adversarial training. To demonstrate our framework's ability to facilitate rapid adaptation to distribution shift, we train a number of simple classifiers from scratch on the frozen outputs of our models using a small amount of data from the shifted distribution. Under this evaluation protocol, our framework significantly outperforms a baseline trained using the empirical risk minimization principle.
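The abstract describes the diversity objective only at a high level, so the following is a minimal PyTorch sketch of one way such an objective could look, under stated assumptions. Two encoders are trained with a shared classification loss plus an adversarial penalty: a discriminator learns to distinguish paired final-layer representations (z1, z2 computed from the same input) from label-conditionally shuffled pairs, and its logit serves as a density-ratio estimate of the conditional total correlation TC(z1, z2 | y), which the encoders are then pressured to minimize. Every architecture, name (make_encoder, shuffle_within_class, lambda_div), and hyperparameter here is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

FEAT_DIM, NUM_CLASSES = 32, 10

def make_encoder(in_dim=784, feat_dim=FEAT_DIM):
    # Illustrative two-layer MLP encoder; the paper's architectures may differ.
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))

enc1, enc2 = make_encoder(), make_encoder()
head1 = nn.Linear(FEAT_DIM, NUM_CLASSES)
head2 = nn.Linear(FEAT_DIM, NUM_CLASSES)

# Discriminator for the density-ratio trick: it separates "paired"
# representations (z1, z2 from the same input) from pairs where z2 is
# shuffled among examples of the same class. At optimality its logit
# estimates log p(z1, z2 | y) - log p(z1 | y)p(z2 | y), whose expectation
# is the conditional total correlation TC(z1, z2 | y).
disc = nn.Sequential(nn.Linear(2 * FEAT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

opt_models = torch.optim.Adam(
    [*enc1.parameters(), *enc2.parameters(),
     *head1.parameters(), *head2.parameters()], lr=1e-3)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)
lambda_div = 0.1  # illustrative weight on the diversity penalty

def shuffle_within_class(z, y):
    """Permute rows of z among examples sharing a label, approximating
    samples from the product of class-conditional marginals."""
    z_shuf = z.clone()
    for c in y.unique():
        idx = (y == c).nonzero(as_tuple=True)[0]
        z_shuf[idx] = z[idx[torch.randperm(len(idx))]]
    return z_shuf

def train_step(x, y):
    z1, z2 = enc1(x), enc2(x)

    # 1) Discriminator step: learn to tell paired representations apart
    #    from label-conditionally shuffled ones.
    joint = torch.cat([z1, z2], dim=1).detach()
    indep = torch.cat([z1, shuffle_within_class(z2, y)], dim=1).detach()
    ones, zeros = torch.ones(len(x), 1), torch.zeros(len(x), 1)
    d_loss = (F.binary_cross_entropy_with_logits(disc(joint), ones)
              + F.binary_cross_entropy_with_logits(disc(indep), zeros))
    opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()

    # 2) Model step: fit the labels while driving the estimated conditional
    #    total correlation down, i.e. pressuring the two representations to
    #    carry non-redundant information given the label.
    ce = F.cross_entropy(head1(z1), y) + F.cross_entropy(head2(z2), y)
    tc_est = disc(torch.cat([z1, z2], dim=1)).mean()
    loss = ce + lambda_div * tc_est
    opt_models.zero_grad(); loss.backward(); opt_models.step()
    return loss.item()

# Example usage on random data:
# x, y = torch.randn(64, 784), torch.randint(0, NUM_CLASSES, (64,))
# loss = train_step(x, y)
```

Under the abstract's evaluation protocol, adaptation would then amount to freezing both encoders and fitting a small classifier from scratch on their frozen outputs using a small amount of data from the shifted distribution.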


Related research

12/18/2022 · On the Connection between Invariant Learning and Adversarial Training for Out-of-Distribution Generalization
03/15/2019 · On Target Shift in Adversarial Domain Adaptation
11/07/2016 · Revisiting Distributionally Robust Supervised Learning in Classification
05/30/2023 · ELSA: Efficient Label Shift Adaptation through the Lens of Semiparametric Models
02/09/2021 · Provable Defense Against Delusive Poisoning
05/26/2022 · Understanding new tasks through the lens of training data via exponential tilting
04/06/2022 · Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
