Investigating Methods to Improve Language Model Integration for Attention-based Encoder-Decoder ASR Models

04/12/2021
by Mohammad Zeineldeen, et al.

Attention-based encoder-decoder (AED) models learn an implicit internal language model (ILM) from the training transcriptions. Integrating an external LM trained on much more unpaired text usually improves performance. A Bayesian interpretation, as in the hybrid autoregressive transducer (HAT), suggests dividing by the prior of the discriminative acoustic model, which corresponds to this implicit LM, as is done in the hybrid hidden Markov model approach. The implicit LM cannot be calculated efficiently in general, and it is still unclear which methods estimate it best. In this work, we compare different approaches from the literature and propose several novel methods to estimate the ILM directly from the AED model. Our proposed methods outperform all previous approaches. We also investigate other methods to suppress the ILM, mainly by decreasing the capacity of the AED model, by limiting the label context, and by training the AED model together with a pre-existing LM.
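In the log domain, "dividing by the prior" becomes subtracting a scaled ILM log-probability from the combined AED and external LM score. The following is a minimal sketch of that log-linear combination for a single decoding step, not the authors' implementation. The function names, the scale values, and the toy probabilities are all hypothetical and chosen only for illustration.

```python
import math

def fused_score(log_p_aed, log_p_ext_lm, log_p_ilm, lm_scale=0.5, ilm_scale=0.3):
    """Log-linear score of one candidate label.

    Adds the scaled external LM log-probability and subtracts the scaled
    ILM log-probability (division by the prior, in the log domain).
    The default scales are illustrative assumptions, not tuned values.
    """
    return log_p_aed + lm_scale * log_p_ext_lm - ilm_scale * log_p_ilm

def pick_label(candidates, lm_scale=0.5, ilm_scale=0.3):
    """Return the label with the highest fused score.

    candidates maps label -> (log_p_aed, log_p_ext_lm, log_p_ilm).
    """
    return max(
        candidates,
        key=lambda label: fused_score(
            *candidates[label], lm_scale=lm_scale, ilm_scale=ilm_scale
        ),
    )

# Toy probabilities (hypothetical): the AED model slightly prefers "thee",
# while the external LM strongly prefers "the".
candidates = {
    "the": (math.log(0.4), math.log(0.6), math.log(0.3)),
    "thee": (math.log(0.5), math.log(0.05), math.log(0.3)),
}

aed_only_choice = pick_label(candidates, lm_scale=0.0, ilm_scale=0.0)
fused_choice = pick_label(candidates)
```

With both scales set to zero the sketch reduces to plain AED decoding. In practice the two scales would be tuned on held-out data, and the ILM log-probability would come from one of the estimation methods the abstract refers to.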


research
09/14/2023

Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation

Attention-based encoder-decoder (AED) speech recognition model has been ...
research
01/26/2022

Internal language model estimation through explicit context vector learning for attention-based encoder-decoder ASR

An end-to-end (E2E) speech recognition model implicitly learns a biased ...
research
05/20/2020

Early Stage LM Integration Using Local and Global Log-Linear Combination

Sequence-to-sequence models with an implicit alignment mechanism (e.g. a...
research
01/24/2022

Improving Factored Hybrid HMM Acoustic Modeling without State Tying

In this work, we show that a factored hybrid hidden Markov model (FH-HMM...
research
05/05/2023

Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation

End-to-end ASR models trained on large amount of data tend to be implici...
research
07/27/2018

A Comparison of Techniques for Language Model Integration in Encoder-Decoder Speech Recognition

Attention-based recurrent neural encoder-decoder models present an elega...
research
10/13/2021

On Language Model Integration for RNN Transducer based Speech Recognition

The mismatch between an external language model (LM) and the implicitly ...
