Probably Approximately Metric-Fair Learning
We study fairness in machine learning. A learning algorithm, given a training set drawn from an underlying population, learns a classifier that will be used to make decisions about individuals. The concern is that this classifier's decisions might be discriminatory, favoring certain subpopulations over others. The seminal work of Dwork et al. [ITCS 2012] introduced fairness through awareness, positing that a fair classifier should treat similar individuals similarly, where similarity between individuals is measured by a task-specific similarity metric. In the context of machine learning, however, this fairness notion faces serious difficulties: it need not generalize from the training set to the underlying population, and enforcing it exactly can be computationally intractable. We introduce a relaxed notion of approximate metric-fairness that allows a small fairness error: for a random pair of individuals sampled from the population, with all but a small probability of error, if they are similar then they are treated similarly. In particular, this provides discrimination protections to every subpopulation that is not too small. We show that approximate metric-fairness does generalize from a training set to the underlying population, and we leverage these generalization guarantees to construct polynomial-time learning algorithms that achieve competitive accuracy subject to fairness constraints.
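To make the relaxed guarantee concrete, a minimal formalization of the quoted condition might read as follows. The notation is ours, not fixed by the abstract: h is the learned classifier, d the task-specific similarity metric, D the underlying population distribution, gamma a small similarity slack, and alpha the small fairness-error probability.

% Sketch of an (alpha, gamma)-approximate metric-fairness condition;
% h, d, D, alpha, and gamma are assumed notation, not taken from the abstract.
\[
  \Pr_{(x,\,x') \sim \mathcal{D} \times \mathcal{D}}
    \Bigl[\, \lvert h(x) - h(x') \rvert \;>\; d(x, x') + \gamma \,\Bigr]
  \;\le\; \alpha
\]

Under this reading, taking alpha = 0 and gamma = 0 recovers the exact Dwork et al. Lipschitz-style condition that similar individuals receive similar treatment, while allowing a small alpha > 0 is what permits generalization from a finite training set and still protects every subpopulation whose mass is not too small relative to alpha.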