In recent years, Large Language Models (LLMs) have garnered significant
...
End-to-end (E2E) spoken language understanding (SLU) systems that genera...
End-to-end multilingual ASR has become more appealing because of several...
We propose a novel deliberation-based approach to end-to-end (E2E) spoke...
Streaming ASR with strict latency constraints is required in many speech...
We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech...
Measuring automatic speech recognition (ASR) system quality is critical ...
On-device speech recognition requires training models of different sizes...
Often, the storage and computational constraints of embeddeddevices dema...
As speech-enabled devices such as smartphones and smart speakers become
...
How to leverage dynamic contextual information in end-to-end speech
reco...
We propose a dynamic encoder transducer (DET) for on-device speech
recog...
Word Error Rate (WER) has been the predominant metric used to evaluate t...
Recurrent transducer models have emerged as a promising solution for spe...
End-to-end models in general, and Recurrent Neural Network Transducer (R...
There is a growing interest in the speech community in developing Recurr...
Attention-based models have been gaining popularity recently for their s...
Recurrent Neural Network Transducer (RNN-T), like most end-to-end speech...
Transformers, originally proposed for natural language processing (NLP)
...
As one of the major sources in speech variability, accents have posed a ...
Neural transducer-based systems such as RNN Transducers (RNN-T) for auto...
We explore options to use Transformer networks in neural transducer for
...
Grapheme-based acoustic modeling has recently been shown to outperform
p...
We propose and evaluate transformer-based acoustic models (AMs) for hybr...
There is an implicit assumption that traditional hybrid approaches for
a...
End-to-end modeling (E2E) of automatic speech recognition (ASR) blends a...
Achieving high accuracy with end-to-end speech recognizers requires care...
Building speech recognizers in multiple languages typically involves
rep...
High accuracy speech recognition requires a large amount of transcribed ...
Recent studies have shown that deep neural networks (DNNs) perform
signi...