Statistical Methods for Assessing Differences in False Non-Match Rates Across Demographic Groups

by Michael Schuckers, et al.

Biometric recognition is used across a variety of applications, from cyber security to border security. Recent research has focused on ensuring that biometric performance (false negatives and false positives) is fair across demographic groups. While there has been significant progress on the development of metrics, the evaluation of performance across groups, and the mitigation of any problems, there has been little work incorporating statistical variation. This matters because differences among groups can arise by chance when no true difference exists; in statistics this is called a Type I error. Observed differences among groups may be due to sampling variation or to actual differences in system performance, and discriminating between these two sources of error is essential for sound decision making about fairness and equity. This paper presents two novel statistical approaches for assessing fairness across demographic groups. The first is a bootstrap-based hypothesis test; the second is a simpler test methodology aimed at a non-statistical audience. For the latter, we present the results of a simulation study of the relationship between the margin of error and factors such as the number of subjects, the number of attempts, the correlation between attempts, the underlying false non-match rates (FNMRs), and the number of groups.
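To illustrate the kind of bootstrap-based test the abstract describes, the sketch below implements one plausible version: subjects are resampled whole (preserving correlation between a subject's attempts), the test statistic is the range of group FNMRs, and the null of equal FNMR is simulated by pooling subjects across groups. The function name, statistic, and pooling scheme are illustrative assumptions, not the paper's actual procedure.

```python
import numpy as np

def bootstrap_fnmr_test(attempts_by_group, n_boot=2000, rng=None):
    """Bootstrap test for equality of FNMR across demographic groups (sketch).

    attempts_by_group: dict mapping group name -> list of per-subject
    0/1 arrays (1 = false non-match). Subjects are resampled whole so
    that within-subject correlation between attempts is preserved.
    Test statistic: range of group FNMRs (max - min).
    Returns (observed statistic, bootstrap p-value).
    """
    rng = rng or np.random.default_rng()

    def fnmr(subjects):
        # FNMR for a group: fraction of false non-matches over all attempts.
        return np.concatenate(subjects).mean()

    def stat(groups):
        rates = [fnmr(s) for s in groups.values()]
        return max(rates) - min(rates)

    observed = stat(attempts_by_group)

    # Pool subjects across groups to simulate the null of equal FNMR,
    # then redraw groups of the original sizes with replacement.
    pooled = [s for subs in attempts_by_group.values() for s in subs]
    sizes = {g: len(s) for g, s in attempts_by_group.items()}
    exceed = 0
    for _ in range(n_boot):
        resampled = {
            g: [pooled[i] for i in rng.integers(0, len(pooled), size=n)]
            for g, n in sizes.items()
        }
        if stat(resampled) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_boot + 1)
    return observed, p_value
```

Under this design, a small p-value indicates that the observed spread in group FNMRs would rarely arise from sampling variation alone, which is precisely the Type I error concern the abstract raises.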

