Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

12/07/2022
by   Zijian Yang, et al.
0

Recently, RNN-Transducers have achieved remarkable results on various automatic speech recognition tasks. However, lattice-free sequence discriminative training methods, which obtain superior performance in hybrid modes, are rarely investigated in RNN-Transducers. In this work, we propose three lattice-free training objectives, namely lattice-free maximum mutual information, lattice-free segment-level minimum Bayes risk, and lattice-free minimum Bayes risk, which are used for the final posterior output of the phoneme-based neural transducer with a limited context dependency. Compared to criteria using N-best lists, lattice-free methods eliminate the decoding step for hypotheses generation during training, which leads to more efficient training. Experimental results show that lattice-free methods gain up to 6.5 relative improvement in word error rate compared to a sequence-level cross-entropy trained model. Compared to the N-best-list based minimum Bayes risk objectives, lattice-free methods gain 40 speedup with a small degradation in performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/08/2018

A Comparison of Lattice-free Discriminative Training Criteria for Purely Sequence-Trained Neural Network Acoustic Models

In this work, three lattice-free (LF) discriminative training criteria f...
research
07/01/2019

Comparison of Lattice-Free and Lattice-Based Sequence Discriminative Training Criteria for LVCSR

Sequence discriminative training criteria have long been a standard tool...
research
10/26/2018

A novel pyramidal-FSMN architecture with lattice-free MMI for speech recognition

Deep Feedforward Sequential Memory Network (DFSMN) has shown superior pe...
research
12/05/2021

Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI

Recently, End-to-End (E2E) frameworks have achieved remarkable results o...
research
11/28/2019

Minimum Bayes Risk Training of RNN-Transducer for End-to-End Speech Recognition

In this work, we propose minimum Bayes risk (MBR) training of RNN-Transd...
research
03/29/2022

Integrate Lattice-Free MMI into End-to-End Speech Recognition

In automatic speech recognition (ASR) research, discriminative criteria ...
research
06/27/2019

Lattice-Based Unsupervised Test-Time Adaptation of Neural Network Acoustic Models

Acoustic model adaptation to unseen test recordings aims to reduce the m...

Please sign up or login with your details

Forgot password? Click here to reset