Streaming End-to-end Speech Recognition For Mobile Devices

11/15/2018
by Yanzhang He, et al.

End-to-end (E2E) models, which directly predict output character sequences given input speech, are good candidates for on-device speech recognition. E2E models, however, present numerous challenges: In order to be truly useful, such models must decode speech utterances in a streaming fashion, in real time; they must be robust to the long tail of use cases; they must be able to leverage user-specific context (e.g., contact lists); and above all, they must be extremely accurate. In this work, we describe our efforts at building an E2E speech recognizer using a recurrent neural network transducer. In experimental evaluations, we find that the proposed approach can outperform a conventional CTC-based model in terms of both latency and accuracy in a number of evaluation categories.


