Distributionally Robust Optimization and Invariant Representation Learning for Addressing Subgroup Underrepresentation: Mechanisms and Limitations

by Nilesh Kumar, et al.

Spurious correlation caused by subgroup underrepresentation has received increasing attention as a source of bias that can be perpetuated by deep neural networks (DNNs). Distributionally robust optimization has shown success in addressing this bias, although its underlying mechanism mostly relies on upweighting under-performing samples as surrogates for those underrepresented in the data. At the same time, while invariant representation learning has been a powerful choice for removing nuisance-sensitive features, it has received little attention in settings where spurious correlations are caused by significant underrepresentation of subgroups. In this paper, we take the first step to better understand and improve the mechanisms for debiasing spurious correlation due to subgroup underrepresentation in medical image classification. Through a comprehensive evaluation study, we first show that 1) generalized reweighting of under-performing samples can be problematic when bias is not the only cause of poor performance, while 2) naive invariant representation learning itself suffers from spurious correlations. We then present a novel approach that leverages robust optimization to facilitate the learning of invariant representations in the presence of spurious correlations. Fine-tuned classifiers using these representations showed an improved ability to reduce subgroup performance disparity while maintaining high average and worst-group performance.
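To make the reweighting mechanism mentioned above concrete, the following is a minimal sketch of one common way robust optimization upweights poorly performing groups: an exponentiated-gradient update on per-group weights. It is an illustration under stated assumptions, not the method proposed in this paper. The function names, the step size, and the per-group loss values are all hypothetical.

```python
import numpy as np


def update_group_weights(weights, group_losses, step_size=0.1):
    """Exponentiated-gradient step: groups with higher loss gain weight.

    Hypothetical helper for illustration; `step_size` is an assumed value.
    """
    weights = np.asarray(weights, dtype=float)
    group_losses = np.asarray(group_losses, dtype=float)
    new_weights = weights * np.exp(step_size * group_losses)
    return new_weights / new_weights.sum()  # renormalize to a distribution


def robust_loss(weights, group_losses):
    """Weighted loss that a model would minimize under the current weights."""
    return float(np.dot(weights, np.asarray(group_losses, dtype=float)))


# Made-up per-group losses; the third group stands in for an
# under-performing (possibly underrepresented) subgroup.
group_losses = np.array([0.2, 0.3, 1.5])
weights = np.full(3, 1.0 / 3.0)

for _ in range(50):
    weights = update_group_weights(weights, group_losses)

final_robust_loss = robust_loss(weights, group_losses)
```

In this sketch the losses are held fixed, so the weights concentrate on the group with the highest loss. This is the behavior the abstract cautions about: the update reacts to poor performance whatever its cause, so a group that performs poorly for reasons other than bias is upweighted all the same.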




