Unsupervised Domain Adaptation for Robust Speech Recognition via Variational Autoencoder-Based Data Augmentation

07/19/2017
by Wei-Ning Hsu, et al.

Domain mismatch between training and testing can lead to significant degradation in performance in many machine learning scenarios. Unfortunately, this is not a rare situation for automatic speech recognition deployments in real-world applications. Research on robust speech recognition can be regarded as trying to overcome this domain mismatch issue. In this paper, we address the unsupervised domain adaptation problem for robust speech recognition, where both source and target domain speech are presented, but word transcripts are only available for the source domain speech. We present novel augmentation-based methods that transform speech in a way that does not change the transcripts. Specifically, we first train a variational autoencoder on both source and target domain data (without supervision) to learn a latent representation of speech. We then transform nuisance attributes of speech that are irrelevant to recognition by modifying the latent representations, in order to augment labeled training data with additional data whose distribution is more similar to the target domain. The proposed method is evaluated on the CHiME-4 dataset and reduces the absolute word error rate (WER) by as much as 35
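The latent-space augmentation step described above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes the latent codes have already been produced by a trained VAE encoder (random stand-in vectors are used here), and it assumes that the difference between the domain means in latent space captures the nuisance attributes. The names `column_means` and `shift_toward_target` are hypothetical. Decoding the shifted codes back to speech features is omitted.

```python
import random
from statistics import fmean


def column_means(vectors):
    """Per-dimension mean of a list of equal-length latent vectors."""
    return [fmean(col) for col in zip(*vectors)]


def shift_toward_target(z_source, z_target):
    """Shift every source latent by the gap between the two domain means.

    Assumption (not from the paper): the mean offset between domains
    stands in for the nuisance attributes to be transformed.
    """
    src_mean = column_means(z_source)
    tgt_mean = column_means(z_target)
    offset = [t - s for s, t in zip(src_mean, tgt_mean)]
    return [[v + o for v, o in zip(z, offset)] for z in z_source]


random.seed(0)
# Stand-in latent codes; in practice these would come from the VAE encoder.
z_src = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(100)]
z_tgt = [[random.gauss(2.0, 1.0) for _ in range(4)] for _ in range(150)]

# Each augmented latent keeps the transcript of the source utterance it came from.
z_aug = shift_toward_target(z_src, z_tgt)
```

Because the transformation is applied only in latent space, the source transcripts can be reused unchanged for the augmented examples, which is the property the augmentation relies on.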

research
08/04/2021

Unsupervised Domain Adaptation in Speech Recognition using Phonetic Features

Automatic speech recognition is a difficult problem in pattern recogniti...
research
03/07/2018

Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition

The performance of automatic speech recognition (ASR) systems can be sig...
research
06/13/2018

Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition

The current trend in automatic speech recognition is to leverage large a...
research
07/27/2021

Unsupervised Domain Adaptation for Hate Speech Detection Using a Data Augmentation Approach

Online harassment in the form of hate speech has been on the rise in rec...
research
12/31/2022

Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A case study for Modern Greek

Modern speech recognition systems exhibit rapid performance degradation...
research
02/22/2023

MADI: Inter-domain Matching and Intra-domain Discrimination for Cross-domain Speech Recognition

End-to-end automatic speech recognition (ASR) usually suffers from perfo...
research
06/18/2021

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization

Dysarthric speech detection (DSD) systems aim to detect characteristics ...
