Learning Smooth and Fair Representations
Organizations that own data face increasing legal liability for its discriminatory use against protected demographic groups, a liability that extends to contractual transactions in which third parties access and use the data. This is problematic because the original data owner cannot anticipate ex ante all future uses of the data by downstream users. This paper explores the data owner's ability to preemptively remove correlations between features and sensitive attributes upstream, by mapping features into a fair representation space. Our main result shows that the fairness of the representation distribution, measured by demographic parity, can be certified from a finite sample if and only if the chi-squared mutual information between features and representations is finite. Empirically, we find that smoothing the representation distribution yields generalization guarantees for fairness certificates, improving upon existing fair representation learning approaches. Moreover, we do not observe that smoothing the representation distribution degrades downstream task accuracy relative to state-of-the-art fair representation learning methods.
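The following is a minimal sketch, not the paper's implementation, of the core idea: smooth a learned representation by adding Gaussian noise, then estimate a demographic parity gap from a finite sample. The synthetic data, the fixed linear "encoder" `w`, the 1-D representation, and the noise scale `sigma` are all hypothetical placeholders chosen for illustration.

```python
# Hypothetical illustration of smoothing a representation distribution and
# empirically estimating a demographic parity (DP) gap from a finite sample.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: features X correlated with a binary sensitive attribute S.
n = 4000
s = rng.integers(0, 2, size=n)                           # sensitive attribute
x = rng.normal(loc=s[:, None], scale=1.0, size=(n, 5))   # features shifted by S

# Stand-in "encoder": a fixed linear projection to a 1-D representation Z.
w = rng.normal(size=5)
z = x @ w

# Smoothing: convolve the representation distribution with Gaussian noise.
sigma = 1.0
z_smooth = z + sigma * rng.normal(size=n)

def dp_gap(z, s, threshold=0.0):
    """Empirical DP gap of a threshold rule on z:
    |P(z > t | S=1) - P(z > t | S=0)|, estimated from the sample."""
    return abs((z[s == 1] > threshold).mean() - (z[s == 0] > threshold).mean())

print("DP gap (raw representation):     ", dp_gap(z, s))
print("DP gap (smoothed representation):", dp_gap(z_smooth, s))
```

In this toy setup, the added noise blurs the group-conditional representation distributions, so the empirical DP gap of the threshold rule shrinks; the paper's actual certification argument rests on the chi-squared mutual information between features and representations, which this sketch does not compute.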