AequeVox: Automated Fairness Testing of Speech Recognition Systems

by Sai Sathiesh Rajan, et al.

Automatic Speech Recognition (ASR) systems have become ubiquitous. They can be found in a variety of form factors and are increasingly important in our daily lives. As such, ensuring that these systems are equitable to different subgroups of the population is crucial. In this paper, we introduce AequeVox, an automated testing framework for evaluating the fairness of ASR systems. AequeVox simulates different environments to assess the effectiveness of ASR systems for different populations. In addition, we investigate whether the chosen simulations are comprehensible to humans. We further propose a fault localization technique capable of identifying words that are not robust to these varying environments. Both components of AequeVox are able to operate in the absence of ground truth data. We evaluated AequeVox on speech from four different datasets using three different commercial ASRs. Our experiments reveal that non-native English, female and Nigerian English speakers generate 109%, 528.5% and 156.9% more errors, on average, than native English, male and UK Midlands speakers, respectively. Our user study also reveals that 82.9% of the simulations (employed through speech transformations) had a comprehensibility rating above seven (out of ten), with the lowest rating being 6.78. This further validates the fairness violations discovered by AequeVox. Finally, we show that the non-robust words, as predicted by the fault localization technique embodied in AequeVox, show 223.8% more errors than the predicted robust words across all ASRs.
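The idea of comparing groups without ground truth can be illustrated with a minimal sketch. It is not the AequeVox implementation: it assumes that the degradation of an ASR system is measured as the word-level edit distance between its transcript of the original audio and its transcript of the same audio after a simulated environment has been applied, and that two groups are compared by their mean degradation. All function names and the sample transcripts are hypothetical and made up for illustration.

```python
from statistics import mean


def word_edit_distance(a: str, b: str) -> int:
    """Word-level Levenshtein distance between two transcripts."""
    x, y = a.lower().split(), b.lower().split()
    prev = list(range(len(y) + 1))
    for i, wx in enumerate(x, start=1):
        curr = [i]
        for j, wy in enumerate(y, start=1):
            cost = 0 if wx == wy else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def mean_degradation(pairs):
    """Mean edit distance over (original, transformed) transcript pairs.

    No reference transcript is needed: the ASR output on the original
    audio serves as the baseline (an assumption of this sketch).
    """
    return mean(word_edit_distance(orig, trans) for orig, trans in pairs)


def relative_excess_errors(group_a, group_b):
    """Percentage by which group_a degrades more than group_b."""
    da, db = mean_degradation(group_a), mean_degradation(group_b)
    if db == 0:
        return float("inf") if da > 0 else 0.0
    return 100.0 * (da - db) / db


# Made-up transcripts: (ASR output on original audio, ASR output after
# a simulated environment such as added noise).
group_a = [
    ("turn on the kitchen lights", "turn on the chicken light"),
    ("call my mother now", "call my other now"),
]
group_b = [
    ("turn on the kitchen lights", "turn on the kitchen light"),
    ("call my mother now", "call my mother now"),
]

excess = relative_excess_errors(group_a, group_b)
print(f"group_a shows {excess:.1f}% more errors than group_b")
```

A tester could flag a potential fairness violation when this percentage exceeds a chosen threshold; the threshold and the choice of distance metric are design decisions that this sketch leaves open.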




