A Curriculum Learning Method for Improved Noise Robustness in Automatic Speech Recognition

06/22/2016
by Stefan Braun, et al.

The performance of automatic speech recognition systems in noisy environments still leaves room for improvement. Speech enhancement or feature enhancement techniques for increasing the noise robustness of these systems usually add components to the recognition system that need careful optimization. In this work, we propose the use of a relatively simple curriculum training strategy called accordion annealing (ACCAN). It uses a multi-stage training schedule in which samples at signal-to-noise ratio (SNR) values as low as 0 dB are added first, and samples at increasingly higher SNR values are then gradually added, up to an SNR value of 50 dB. We also use a method called per-epoch noise mixing (PEM) that generates noisy training samples online during training and thus enables dynamically changing the SNR of our training data. Both the ACCAN and the PEM methods are evaluated on an end-to-end speech recognition pipeline on the Wall Street Journal corpus. ACCAN decreases the average word error rate (WER) on the 20 dB to -10 dB SNR range by up to 31.4% compared to a conventional multi-condition training method.
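
The two ideas in the abstract, mixing noise into clean speech at a chosen SNR on the fly and widening the set of training SNRs stage by stage, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`mix_at_snr`, `accan_stages`, `sample_snr`), the 5 dB step between stages, and the one-level-per-stage growth are assumptions made for illustration; the abstract only states that training starts at SNRs as low as 0 dB and extends up to 50 dB.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the speech-to-noise power ratio is snr_db.

    Assumes `noise` is at least as long as `speech` and is not silent.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Choose gain g so that 10*log10(p_speech / (g**2 * p_noise)) == snr_db.
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise


def accan_stages(low_db=0, high_db=50, step_db=5):
    """Hypothetical curriculum: stage i trains on the i+1 lowest SNR levels.

    The step size and the one-level-per-stage growth are assumptions,
    not values taken from the abstract.
    """
    levels = list(range(low_db, high_db + 1, step_db))
    return [levels[: i + 1] for i in range(len(levels))]


def sample_snr(stage_levels, rng):
    """Draw one SNR value for a training sample from the current stage."""
    return float(rng.choice(stage_levels))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal(16000)      # stand-in for a speech waveform
    babble = rng.standard_normal(16000)     # stand-in for a noise recording
    for stage in accan_stages()[:3]:
        # Per-epoch mixing: a fresh SNR is drawn each time a sample is used.
        snr = sample_snr(stage, rng)
        noisy = mix_at_snr(clean, babble, snr)
        print(stage, snr, noisy.shape)
```

Because the noisy waveform is regenerated each time a sample is drawn, the SNR distribution of the training data can change between epochs without storing a separate noisy copy of the corpus, which is what makes a staged schedule of this kind practical.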

research
10/24/2022

Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation

Most automatic speech processing systems are sensitive to the acoustic e...
research
12/12/2021

Improving Speech Recognition on Noisy Speech via Speech Enhancement with Multi-Discriminators CycleGAN

This paper presents our latest investigations on improving automatic spe...
research
06/18/2019

Deep Xi as a Front-End for Robust Automatic Speech Recognition

Front-end techniques for robust automatic speech recognition (ASR) have ...
research
05/17/2022

Streaming Noise Context Aware Enhancement For Automatic Speech Recognition in Multi-Talker Environments

One of the most challenging scenarios for smart speakers is multi-talker...
research
10/04/2021

Building a Noisy Audio Dataset to Evaluate Machine Learning Approaches for Automatic Speech Recognition Systems

Automatic speech recognition systems are part of people's daily lives, e...
research
05/02/2018

Information Loss in the Human Auditory System

From the eardrum to the auditory cortex, where acoustic stimuli are deco...
research
07/26/2019

Correlation Distance Skip Connection Denoising Autoencoder (CDSK-DAE) for Speech Feature Enhancement

Performance of learning based Automatic Speech Recognition (ASR) is susc...
