Can Active Learning Preemptively Mitigate Fairness Issues?

Dataset bias is one of the prevailing causes of unfairness in machine learning. Addressing fairness at the data collection and dataset preparation stages is therefore an essential part of training fairer algorithms. In particular, active learning (AL) algorithms show promise for this task by prioritizing the most informative training samples. However, the effect of existing AL algorithms on algorithmic fairness, and their interaction with it, remain under-explored. In this paper, we study whether models trained with uncertainty-based AL heuristics such as BALD are fairer in their decisions with respect to a protected class than those trained with independently and identically distributed (i.i.d.) sampling. We find a significant improvement in predictive parity when using BALD, along with improved accuracy compared to i.i.d. sampling. We also explore the interaction between algorithmic fairness methods such as gradient reversal (GRAD) and BALD. We find that, while they address different fairness issues, combining them further improves results on most of the benchmarks and metrics we explored.
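To make the acquisition strategy concrete, the following is a minimal sketch of the BALD heuristic the abstract refers to: the mutual information between predictions and model parameters, estimated from Monte Carlo dropout samples. The function name, array shapes, and the synthetic Dirichlet probabilities below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bald_scores(mc_probs):
    """BALD acquisition score per pool sample.

    mc_probs: array of shape (n_mc, n_samples, n_classes) holding class
    probabilities from n_mc stochastic (e.g. MC-dropout) forward passes.
    Returns the estimated mutual information I[y; w | x] for each sample.
    """
    eps = 1e-12
    mean_probs = mc_probs.mean(axis=0)  # (n_samples, n_classes)
    # Entropy of the averaged prediction: total predictive uncertainty.
    h_mean = -(mean_probs * np.log(mean_probs + eps)).sum(axis=-1)
    # Average entropy of each stochastic prediction: aleatoric part.
    mean_h = -(mc_probs * np.log(mc_probs + eps)).sum(axis=-1).mean(axis=0)
    # The difference is the epistemic (model) uncertainty BALD targets.
    return h_mean - mean_h

# Illustrative use: rank a synthetic unlabelled pool; the highest-scoring
# samples are the ones an AL loop would send for labelling next.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(3), size=(20, 100))  # (n_mc, n_samples, n_classes)
query_order = np.argsort(-bald_scores(probs))
```

Because entropy is concave, the score is non-negative: it is large when the stochastic passes disagree (each confident, but about different classes) and near zero when they agree, which is what makes it a proxy for informativeness.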



Code Repositories


Library to enable Bayesian active learning in your research or labeling work.



Code for the paper "Can Active Learning Preemptively Mitigate Fairness Issues?" presented at RAI 2021.

