Improving Accent Identification and Accented Speech Recognition Under a Framework of Self-supervised Learning

09/15/2021
by   Keqi Deng, et al.
0

Recently, self-supervised pre-training has gained success in automatic speech recognition (ASR). However, considering the difference between speech accents in real scenarios, how to identify accents and use accent features to improve ASR is still challenging. In this paper, we employ the self-supervised pre-training method for both accent identification and accented speech recognition tasks. For the former task, a standard deviation constraint loss (SDC-loss) based end-to-end (E2E) architecture is proposed to identify accents under the same language. As for accented speech recognition task, we design an accent-dependent ASR system, which can utilize additional accent input features. Furthermore, we propose a frame-level accent feature, which is extracted based on the proposed accent identification model and can be dynamically adjusted. We pre-train our models using 960 hours unlabeled LibriSpeech dataset and fine-tune them on AESRC2020 speech dataset. The experimental results show that our proposed accent-dependent ASR system is significantly ahead of the AESRC2020 baseline and achieves 6.5% relative word error rate (WER) reduction compared with our accent-independent ASR system.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
04/06/2022

Can Self-Supervised Learning solve the problem of child speech recognition?

Despite recent advancements in deep learning technologies, Child Speech ...
research
06/30/2022

FeaRLESS: Feature Refinement Loss for Ensembling Self-Supervised Learning Features in Robust End-to-end Speech Recognition

Self-supervised learning representations (SSLR) have resulted in robust ...
research
11/04/2022

Biased Self-supervised learning for ASR

Self-supervised learning via masked prediction pre-training (MPPT) has s...
research
10/09/2021

Wav2vec-S: Semi-Supervised Pre-Training for Speech Recognition

Self-supervised pre-training has dramatically improved the performance o...
research
12/14/2021

Improving Hybrid CTC/Attention End-to-end Speech Recognition with Pretrained Acoustic and Language Model

Recently, self-supervised pretraining has achieved impressive results in...
research
10/26/2022

UFO2: A unified pre-training framework for online and offline speech recognition

In this paper, we propose a Unified pre-training Framework for Online an...
research
02/18/2023

Front-End Adapter: Adapting Front-End Input of Speech based Self-Supervised Learning for Speech Recognition

Recent years have witnessed a boom in self-supervised learning (SSL) in ...

Please sign up or login with your details

Forgot password? Click here to reset