Attentive Adversarial Learning for Domain-Invariant Training

04/28/2019
by   Zhong Meng, et al.

Adversarial domain-invariant training (ADIT) has proven effective in suppressing the effects of domain variability in acoustic modeling and has led to improved performance in automatic speech recognition (ASR). In ADIT, an auxiliary domain classifier takes in equally weighted deep features from a deep neural network (DNN) acoustic model, and the deep features are trained to be more domain-invariant by optimizing an adversarial loss function. In this work, we propose attentive ADIT (AADIT), in which we augment the domain classifier with an attention mechanism that automatically weights the input deep features according to their importance in domain classification. With this attentive re-weighting, AADIT can focus on the domain normalization of the phonetic components that are more susceptible to domain variability, and it generates deep features with improved domain-invariance and senone-discriminativity over ADIT. Most importantly, the attention block serves only as an external component to the DNN acoustic model and is not involved in ASR, so AADIT can be used to improve acoustic modeling with any DNN architecture. More generally, the same methodology can improve any adversarial learning system with an auxiliary discriminator. Evaluated on the CHiME-3 dataset, AADIT achieves a 13.6% relative WER improvement over a multi-conditional model and also improves on a strong ADIT baseline.
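The attentive re-weighting described in the abstract can be illustrated with a minimal NumPy sketch of the forward pass. This is not the authors' implementation: the abstract does not specify the form of the attention, so the scaled dot-product scoring, the use of the mean feature as the query, and all names (`attentive_domain_posterior`, `query_w`, `key_w`, `clf_w`) are assumptions made purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    z = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

def attentive_domain_posterior(features, query_w, key_w, clf_w):
    """Hypothetical attentive domain classifier (forward pass only).

    features: (T, D) deep features from the acoustic model for T frames.
    Returns the attention weights (T,) and the domain posterior (num_domains,).
    """
    keys = features @ key_w                          # (T, A)
    # Assumption: the query is a projection of the mean feature.
    query = features.mean(axis=0) @ query_w          # (A,)
    scores = keys @ query / np.sqrt(keys.shape[1])   # (T,)
    weights = softmax(scores)                        # non-uniform frame weights
    context = weights @ features                     # (D,) weighted feature sum
    posterior = softmax(context @ clf_w)             # (num_domains,)
    return weights, posterior

def domain_loss(posterior, domain_id):
    # Cross-entropy of the domain classifier. In adversarial training the
    # classifier and attention block would minimize this loss, while the
    # acoustic model's feature layers would be updated to maximize it.
    return -np.log(posterior[domain_id])

# Toy dimensions and random parameters, chosen only for the example.
rng = np.random.default_rng(0)
T, D, A, NUM_DOMAINS = 8, 16, 4, 3
features = rng.standard_normal((T, D))
query_w = rng.standard_normal((D, A)) * 0.1
key_w = rng.standard_normal((D, A)) * 0.1
clf_w = rng.standard_normal((D, NUM_DOMAINS)) * 0.1

weights, posterior = attentive_domain_posterior(features, query_w, key_w, clf_w)
loss = domain_loss(posterior, domain_id=1)
```

Replacing `weights` with a uniform vector `np.full(T, 1.0 / T)` would correspond to feeding the classifier equally weighted features over the same frames. Because the attention parameters sit only in the domain classifier, they would be discarded after training and would not affect the acoustic model used for recognition.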


research
05/21/2018

Adversarial Learning of Raw Speech Features for Domain Invariant Speech Recognition

Recent advances in neural network based acoustic modelling have shown si...
research
04/02/2018

Speaker-Invariant Training via Adversarial Learning

We propose a novel adversarial multi-task learning scheme, aiming at act...
research
04/29/2019

Adversarial Speaker Adaptation

We propose a novel adversarial speaker adaptation (ASA) scheme, in which...
research
11/21/2017

Unsupervised Adaptation with Domain Separation Networks for Robust Speech Recognition

Unsupervised domain adaptation of speech signal aims at adapting a well-...
research
01/01/2020

Attentive Batch Normalization for LSTM-based Acoustic Modeling of Speech Recognition

Batch normalization (BN) is an effective method to accelerate model trai...
research
10/03/2022

Efficient acoustic feature transformation in mismatched environments using a Guided-GAN

We propose a new framework to improve automatic speech recognition (ASR)...
research
05/06/2022

A Conformer-based Waveform-domain Neural Acoustic Echo Canceller Optimized for ASR Accuracy

Acoustic Echo Cancellation (AEC) is essential for accurate recognition o...