LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge

10/14/2022
by Yan Jia, et al.

This paper describes the LeVoice automatic speech recognition systems submitted to Track 2 of the Intelligent Cockpit Speech Recognition Challenge 2022. Track 2 is a speech recognition task with no limit on model size. Our main contributions are deep-learning-based speech enhancement, text-to-speech-based speech generation, training data augmentation via various techniques, and speech recognition model fusion. We compared and fused a hybrid architecture and two kinds of end-to-end architecture. For end-to-end modeling, we used models based on a connectionist temporal classification/attention-based encoder-decoder architecture and on a recurrent neural network transducer/attention-based encoder-decoder architecture. These models were evaluated with an additional language model to reduce word error rates. As a result, our system achieved a 10.2% character error rate on the challenge test set and ranked third among the submitted systems in the challenge.
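For readers unfamiliar with the reported metric, the following is a minimal sketch of how a character error rate is commonly computed: the total character-level edit distance between references and hypotheses, divided by the total number of reference characters. The function names and the example utterance are illustrative assumptions; this is not the challenge's official scoring tool, and the text normalization applied by the organizers is not described here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# Hypothetical example: one substituted character out of four.
cer = character_error_rate(["打开空调"], ["打开空条"])
print(f"CER = {cer:.2%}")
```

Under this definition, a 10.2% character error rate corresponds to roughly one character edit per ten reference characters.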


research
06/08/2017

Advances in Joint CTC-Attention based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM

We present a state-of-the-art end-to-end Automatic Speech Recognition (A...
research
12/14/2020

A review of on-device fully neural end-to-end automatic speech recognition algorithms

In this paper, we review various end-to-end automatic speech recognition...
research
03/14/2017

Multichannel End-to-end Speech Recognition

The field of speech recognition is in the midst of a paradigm shift: end...
research
09/16/2023

Decoder-only Architecture for Speech Recognition with CTC Prompts and Text Data Augmentation

Collecting audio-text pairs is expensive; however, it is much easier to ...
research
01/29/2021

BCN2BRNO: ASR System Fusion for Albayzin 2020 Speech to Text Challenge

This paper describes joint effort of BUT and Telefónica Research on deve...
research
10/27/2020

Multitask Training with Text Data for End-to-End Speech Recognition

We propose a multitask training method for attention-based end-to-end sp...
research
02/22/2022

Improving CTC-based speech recognition via knowledge transferring from pre-trained language models

Recently, end-to-end automatic speech recognition models based on connec...
