Systematic analysis of the impact of label noise correction on ML Fairness

06/28/2023
by I. Oliveira e Silva, et al.

Arbitrary, inconsistent, or faulty decision-making raises serious concerns, and preventing unfair models is an increasingly important challenge in Machine Learning. Data often reflect past discriminatory behavior, and models trained on such data may exhibit bias with respect to sensitive attributes, such as gender, race, or age. One approach to developing fair models is to preprocess the training data to remove the underlying biases while preserving the relevant information, for example, by correcting biased labels. While multiple label noise correction methods are available, little is known about their behavior in identifying discrimination. In this work, we develop an empirical methodology to systematically evaluate the effectiveness of label noise correction techniques in ensuring the fairness of models trained on biased datasets. Our methodology involves manipulating the amount of label noise and can be used with fairness benchmarks as well as with standard ML datasets. We apply the methodology to analyze six label noise correction methods according to several fairness metrics on standard OpenML datasets. Our results suggest that the Hybrid Label Noise Correction method achieves the best trade-off between predictive performance and fairness. Clustering-Based Correction can reduce discrimination the most, though at the cost of lower predictive performance.
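The core of the evaluation setup described above, i.e. injecting a controlled amount of group-dependent label noise into a dataset and then measuring the fairness of a model trained on the noisy labels, can be sketched as follows. This is an illustrative example, not the authors' code: the synthetic dataset, the noise rate, and the use of demographic parity difference as the fairness metric are all assumptions made here for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic dataset: a binary sensitive attribute and two features
# (stand-ins for a real fairness benchmark or OpenML dataset).
n = 2000
sensitive = rng.integers(0, 2, n)
x = rng.normal(size=(n, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(int)

# Inject group-dependent label noise: flip positive labels of one group
# with probability `noise_rate`, simulating historical discrimination.
noise_rate = 0.3  # assumed value; the methodology varies this amount
flip = (sensitive == 1) & (y == 1) & (rng.random(n) < noise_rate)
y_biased = np.where(flip, 0, y)

# A label noise correction method would be applied to y_biased here;
# this sketch trains directly on the biased labels as a baseline.
X = np.column_stack([sensitive, x])
X_tr, X_te, y_tr, y_te, s_tr, s_te = train_test_split(
    X, y_biased, sensitive, test_size=0.5, random_state=0)

model = LogisticRegression().fit(X_tr, y_tr)
pred = model.predict(X_te)

# Demographic parity difference: gap in positive-prediction rates
# between the two sensitive groups (0 = parity).
dpd = abs(pred[s_te == 0].mean() - pred[s_te == 1].mean())
print(f"Demographic parity difference: {dpd:.3f}")
```

Repeating this loop across noise rates, correction methods, and fairness metrics yields the kind of systematic comparison the paper reports.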


