Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data

05/16/2022
by Alëna Aksënova, et al.

Building inclusive speech recognition systems is a crucial step towards developing technologies that speakers of all language varieties can use. ASR systems must therefore work for everybody, regardless of how they speak. Accomplishing this goal requires data sets that represent language varieties, as well as an understanding of which model configurations are most helpful for robust recognition of all types of speech. However, there are not enough data sets for accented speech, and for those that are already available, more training approaches need to be explored to improve the quality of accented speech recognition. In this paper, we discuss recent progress towards developing more inclusive ASR systems, namely the importance of building new data sets that represent linguistic diversity and of exploring novel training approaches to improve performance for all users. We address recent directions in benchmarking ASR systems for accented speech, measure the effects of wav2vec 2.0 pre-training on accented speech recognition, and highlight corpora relevant for diverse ASR evaluations.

research
07/04/2023

Boosting Norwegian Automatic Speech Recognition

In this paper, we present several baselines for automatic speech recogni...
research
03/07/2022

Building and curating conversational corpora for diversity-aware language science and technology

We present a pipeline and tools to build a maximally natural data set of...
research
01/15/2022

Recent Progress in the CUHK Dysarthric Speech Recognition System

Despite the rapid progress of automatic speech recognition (ASR) technol...
research
10/12/2019

VAIS ASR: Building a conversational speech recognition system using language model combination

Automatic Speech Recognition (ASR) systems have been evolving quickly an...
research
03/22/2020

Training for Speech Recognition on Coprocessors

Automatic Speech Recognition (ASR) has increased in popularity in recent...
research
02/24/2022

Ask2Mask: Guided Data Selection for Masked Speech Modeling

Masked speech modeling (MSM) methods such as wav2vec2 or w2v-BERT learn ...
