Improved Language Identification Through Cross-Lingual Self-Supervised Learning

07/08/2021
by Andros Tjandra, et al.

Language identification greatly impacts the success of downstream tasks such as automatic speech recognition. Recently, self-supervised speech representations learned by wav2vec 2.0 have been shown to be very effective for a range of speech tasks. We extend previous self-supervised work on language identification by experimenting with pre-trained models which were learned on real-world unconstrained speech in multiple languages, not just on English. We show that models pre-trained on many languages perform better and enable language identification systems that require very little labeled data to perform well. Results on a 25-language setup show that with only 10 minutes of labeled data per language, a cross-lingually pre-trained model can achieve over 93% accuracy.
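The overall recipe the abstract describes — pooling frame-level representations from a pre-trained speech encoder and classifying the utterance with a lightweight language-ID head — can be sketched as follows. This is a minimal illustration, not the authors' code: the wav2vec 2.0 encoder is stubbed out with random frame embeddings, and the classifier is an untrained linear layer.

```python
import numpy as np

def classify_language(frame_features, W, b):
    """Mean-pool frame-level speech representations (stand-ins for
    wav2vec 2.0 encoder outputs) and apply a linear language-ID head.

    frame_features: (T, D) array of per-frame embeddings
    W: (D, num_languages) classifier weights; b: (num_languages,) bias
    Returns a probability distribution over languages.
    """
    pooled = frame_features.mean(axis=0)      # (D,) utterance embedding
    logits = pooled @ W + b                   # (num_languages,) scores
    exp = np.exp(logits - logits.max())       # numerically stable softmax
    return exp / exp.sum()

rng = np.random.default_rng(0)
T, D, num_languages = 100, 16, 25   # 25 languages, as in the paper's setup
feats = rng.normal(size=(T, D))     # hypothetical encoder output
W = rng.normal(size=(D, num_languages))
b = np.zeros(num_languages)
probs = classify_language(feats, W, b)
print(probs.shape)
```

In the paper's setting the encoder weights come from cross-lingual pre-training and the head (or the whole model) is fine-tuned on the small labeled set; the sketch only shows the inference-time shape of the computation.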

