Predictive Modeling in the Presence of Nuisance-Induced Spurious Correlations

by Aahlad Puli, et al.

Deep predictive models often exploit spurious correlations between the label and the covariates, and these correlations can differ between the training and test distributions. In many classification tasks, spurious correlations are induced by a changing relationship between the label and some nuisance variables that are correlated with the covariates. For example, in classifying animals in natural images, the background, which is the nuisance, can predict the type of animal, but this nuisance-label relationship does not always hold. We formalize a family of distributions that differ only in the nuisance-label relationship and introduce a distribution in which this relationship is broken, called the nuisance-randomized distribution. We introduce a set of predictive models built from the nuisance-randomized distribution whose representations, when conditioned on, leave the label and the nuisance uncorrelated. For models in this set, we lower bound the performance on any member of the family by the mutual information between the representation and the label under the nuisance-randomized distribution. To build predictive models that maximize this performance lower bound, we develop Nuisance-Randomized Distillation (NURD). We evaluate NURD on a synthetic example, colored-MNIST, and chest X-ray classification. When using non-lung patches as the nuisance in classifying chest X-rays, NURD produces models that predict pneumonia under strong spurious correlations.
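
To make the idea of a nuisance-randomized distribution concrete, the following is a minimal sketch, not the authors' implementation. It assumes a discrete label and a discrete nuisance, and it uses importance reweighting with weights p(y) / p(y | z) as one possible way to break the nuisance-label relationship. The abstract does not state how the distribution is constructed, so the reweighting scheme, the function names, and the toy data are all illustrative assumptions.

```python
from collections import Counter


def nuisance_randomizing_weights(labels, nuisances):
    """Hypothetical helper: weight each sample by p(y) / p(y | z).

    Probabilities are estimated from empirical counts, so this assumes
    discrete labels and nuisances (an assumption made for illustration).
    """
    n = len(labels)
    count_y = Counter(labels)
    count_z = Counter(nuisances)
    count_yz = Counter(zip(labels, nuisances))
    weights = []
    for y, z in zip(labels, nuisances):
        p_y = count_y[y] / n
        p_y_given_z = count_yz[(y, z)] / count_z[z]
        weights.append(p_y / p_y_given_z)
    return weights


def weighted_joint(labels, nuisances, weights):
    """Joint distribution of (label, nuisance) under normalized weights."""
    total = sum(weights)
    joint = Counter()
    for y, z, w in zip(labels, nuisances, weights):
        joint[(y, z)] += w / total
    return dict(joint)


# Toy data: the nuisance (e.g. background) matches the label 80% of the time.
labels = [0] * 50 + [1] * 50
nuisances = [0] * 40 + [1] * 10 + [1] * 40 + [0] * 10

weights = nuisance_randomizing_weights(labels, nuisances)
joint = weighted_joint(labels, nuisances, weights)

# After reweighting, each (label, nuisance) cell carries equal mass, so the
# joint factorizes as p(y) * p(z) and the label is independent of the nuisance.
for cell in sorted(joint):
    print(cell, round(joint[cell], 3))
```

In this toy setup the unweighted data put 80% of the mass on matching label-nuisance pairs. Under the weights, the mass is spread evenly across all four pairs, so the nuisance alone no longer predicts the label.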




