Feature-Wise Bias Amplification

12/21/2018
by Klas Leino, et al.

We study the phenomenon of bias amplification in classifiers, wherein a machine learning model learns to predict classes with a greater disparity than the underlying ground truth. We demonstrate that bias amplification can arise via an inductive bias in gradient descent methods that results in the overestimation of the importance of moderately-predictive "weak" features if insufficient training data is available. This overestimation gives rise to feature-wise bias amplification -- a previously unreported form of bias that can be traced back to the features of a trained model. Through analysis and experiments, we show that while some bias cannot be mitigated without sacrificing accuracy, feature-wise bias amplification can be mitigated through targeted feature selection. We present two new feature selection algorithms for mitigating bias amplification in linear models, and show how they can be adapted to convolutional neural networks efficiently. Our experiments on synthetic and real data demonstrate that these algorithms consistently lead to reduced bias without harming accuracy, in some cases eliminating predictive bias altogether while providing modest gains in accuracy.
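The paper's two feature selection algorithms are not reproduced here, but the setup they target can be sketched briefly. The sketch below (a minimal illustration assuming scikit-learn; the |coefficient| ranking and the cutoff k are placeholder choices, not the paper's bias-aware criteria) measures predictive bias as the gap between a model's positive-prediction rate and the ground-truth positive rate, then prunes all but the strongest features and re-measures.

```python
# Illustrative sketch, not the paper's exact algorithms: quantify predictive
# bias as (predicted positive rate) - (ground-truth positive rate), then
# prune weak features and check how the bias and accuracy change.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def predictive_bias(model, X, y):
    """Gap between the model's positive-prediction rate and the base rate."""
    return model.predict(X).mean() - y.mean()

# Synthetic, class-imbalanced data with a few informative features and many
# weak/noisy ones -- the regime in which weak features get overweighted.
X, y = make_classification(n_samples=500, n_features=100, n_informative=5,
                           weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("bias before pruning:", predictive_bias(model, X_te, y_te))

# Rank features by |coefficient| as a crude proxy for predictive strength
# and keep only the strongest k. The paper's algorithms choose which weak
# features to drop with more careful, bias-aware criteria.
k = 10
keep = np.argsort(-np.abs(model.coef_[0]))[:k]
pruned = LogisticRegression(max_iter=1000).fit(X_tr[:, keep], y_tr)
print("bias after pruning:", predictive_bias(pruned, X_te[:, keep], y_te))
print("accuracy after pruning:", pruned.score(X_te[:, keep], y_te))
```

On synthetic data of this shape, dropping the weakest features often shrinks the predicted-versus-ground-truth disparity without costing accuracy, consistent with the abstract's claim; the exact effect depends on the sample size and feature strengths.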


Related research

07/06/2022
DIWIFT: Discovering Instance-wise Influential Features for Tabular Data
Tabular data is one of the most common data storage formats in business ...

06/20/2023
The Implicit Bias of Batch Normalization in Linear Models and Two-layer Linear Convolutional Neural Networks
We study the implicit bias of batch normalization trained by gradient de...

10/26/2020
Feature Selection Using Batch-Wise Attenuation and Feature Mask Normalization
Feature selection is generally used as one of the most important pre-pro...

03/22/2021
Detecting Racial Bias in Jury Selection
To support the 2019 U.S. Supreme Court case "Flowers v. Mississippi", AP...

12/11/2002
Technical Note: Bias and the Quantification of Stability
Research on bias in machine learning algorithms has generally been conce...

04/16/2019
REPAIR: Removing Representation Bias by Dataset Resampling
Modern machine learning datasets can have biases for certain representat...

06/21/2023
ProtoGate: Prototype-based Neural Networks with Local Feature Selection for Tabular Biomedical Data
Tabular biomedical data poses challenges in machine learning because it ...
