Building High-accuracy Multilingual ASR with Gated Language Experts and Curriculum Training

03/01/2023
by   Eric Sun, et al.

We propose gated language experts to improve multilingual transformer transducer models without requiring any language identification (LID) input from users during inference. We define a gating mechanism and an LID loss that let the transformer encoders learn language-dependent information, construct the multilingual transformer block from gated transformer experts and shared transformer layers to keep the model compact, and apply linear experts to the joint network output to better regularize the combined speech acoustic and token label information. Furthermore, we propose a curriculum training scheme in which LID guides the gated language experts so that each better serves its corresponding language. Evaluated on an English and Spanish bilingual task, our methods achieve an average 12.5% relative word error rate reduction over the baseline bilingual model and the monolingual models, obtaining results similar to the upper-bound model trained and inferred with oracle LID. We further apply our method to trilingual, quadrilingual, and pentalingual models and observe advantages similar to those of the bilingual models, demonstrating that the approach extends easily to more languages.
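The core idea of gating language experts on an LID signal can be illustrated with a minimal sketch: a soft gate, computed from the encoder frame, mixes the outputs of per-language expert layers so that no explicit LID input is needed at inference. All function and variable names here are illustrative assumptions, not the paper's implementation; the experts are reduced to single linear maps for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_experts(frame, expert_weights, gate_weights):
    """Mix per-language expert outputs with a learned soft gate.

    frame:          (d,) encoder frame vector
    expert_weights: list of (d, d) matrices, one linear expert per language
    gate_weights:   (num_langs, d) projection producing LID-like logits

    Returns the gated combination of expert outputs and the gate values.
    (Hypothetical names; the actual experts are transformer layers.)
    """
    gates = softmax(gate_weights @ frame)            # soft LID posterior
    outputs = np.stack([W @ frame for W in expert_weights])  # (L, d)
    return gates @ outputs, gates                    # weighted sum over experts

# Toy usage: two languages, 8-dimensional frames.
rng = np.random.default_rng(0)
d, num_langs = 8, 2
frame = rng.standard_normal(d)
experts = [rng.standard_normal((d, d)) for _ in range(num_langs)]
gate_w = rng.standard_normal((num_langs, d))
out, gates = gated_experts(frame, experts, gate_w)
```

In the paper's curriculum training, an auxiliary LID loss would additionally push `gates` toward the true language during early training, so each expert specializes before the gate is relied on at inference.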


Related research

- Building a great multi-lingual teacher with sparsely-gated mixture of experts for speech recognition (12/10/2021): The sparsely-gated Mixture of Experts (MoE) can magnify a network capaci...
- Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition (07/12/2023): Multilingual speech recognition for both monolingual and code-switching ...
- Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages (09/08/2022): It is challenging to train and deploy Transformer LMs for hybrid speech ...
- Adversarial Training for Multilingual Acoustic Modeling (06/17/2019): Multilingual training has been shown to improve acoustic modeling perfor...
- Massively Multilingual Shallow Fusion with Large Language Models (02/17/2023): While large language models (LLM) have made impressive progress in natur...
- Neuromodulation Gated Transformer (05/05/2023): We introduce a novel architecture, the Neuromodulation Gated Transformer...
- Streaming End-to-End Bilingual ASR Systems with Joint Language Identification (07/08/2020): Multilingual ASR technology simplifies model training and deployment, bu...
