Improved Self-Supervised Multilingual Speech Representation Learning Combined with Auxiliary Language Information

12/07/2022
by   Fenglin Ding, et al.

Multilingual end-to-end models have shown great improvement over monolingual systems. With the development of pre-training methods on speech, self-supervised multilingual speech representation learning such as XLSR has shown success in improving the performance of multilingual automatic speech recognition (ASR). However, similar to supervised learning, multilingual pre-training may also suffer from language interference, which further degrades the performance of the multilingual system. In this paper, we introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information, including language adversarial training, language embedding, and language adaptive training during the pre-training stage. We conduct experiments on a multilingual ASR task covering 16 languages. Our experimental results demonstrate a 14.3% relative gain over the standard XLSR model and a 19.8% relative gain over the multilingual model without pre-training.
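Since the abstract only names the three techniques, the following is a minimal PyTorch sketch of how such components are commonly wired together: a gradient reversal layer for language adversarial training, an additive language embedding, and per-language adapters standing in for language adaptive training. The toy GRU encoder, layer sizes, and the lambda coefficient are illustrative assumptions, not the paper's actual XLSR architecture or hyperparameters.

import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in backward,
    pushing the encoder toward language-invariant features while the
    language classifier still learns to discriminate languages."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class LanguageAwareEncoder(nn.Module):
    """Toy stand-in for an XLSR-style encoder; all sizes are illustrative."""

    def __init__(self, feat_dim=80, hidden=256, num_langs=16, lam=0.1):
        super().__init__()
        self.lam = lam
        self.encoder = nn.GRU(feat_dim, hidden, batch_first=True)
        # (1) Language embedding: inject language identity into the features.
        self.lang_emb = nn.Embedding(num_langs, hidden)
        # (2) Language adaptive training: one small adapter per language.
        self.adapters = nn.ModuleList(
            [nn.Linear(hidden, hidden) for _ in range(num_langs)]
        )
        # (3) Language adversarial training: classifier behind the GRL.
        self.lang_clf = nn.Sequential(
            nn.Linear(hidden, hidden), nn.ReLU(), nn.Linear(hidden, num_langs)
        )

    def forward(self, feats, lang_ids):
        h, _ = self.encoder(feats)                        # (B, T, hidden)
        pooled = h.mean(dim=1)                            # utterance summary
        # Adversarial branch: gradients reaching the encoder are sign-flipped.
        lang_logits = self.lang_clf(GradReverse.apply(pooled, self.lam))
        # Language-aware branches: additive embedding + per-language adapter.
        h = h + self.lang_emb(lang_ids).unsqueeze(1)
        h = torch.stack(
            [self.adapters[l](h[i]) for i, l in enumerate(lang_ids.tolist())]
        )
        return h, lang_logits


# Toy usage: 4 utterances, 100 frames of 80-dim features, 16 languages.
model = LanguageAwareEncoder()
feats = torch.randn(4, 100, 80)
lang_ids = torch.randint(0, 16, (4,))
h, lang_logits = model(feats, lang_ids)
adv_loss = nn.CrossEntropyLoss()(lang_logits, lang_ids)
adv_loss.backward()  # would be added to the self-supervised pre-training loss

The lam coefficient controls how strongly the reversed gradient penalizes language-discriminative features; in a real pre-training setup the adversarial loss would be combined with the contrastive self-supervised objective rather than used alone.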

