This paper presents a method for selecting appropriate synthetic speech
...
Standard Recurrent Neural Network Transducer (RNN-T) decoding algorithm...
With 4.5 million hours of English speech from 10 different sources acros...
How to leverage dynamic contextual information in end-to-end speech
reco...
End-to-end models in general, and Recurrent Neural Network Transducer (R...
There is a growing interest in the speech community in developing Recurr...
End-to-end (E2E) systems for automatic speech recognition (ASR), such as...
N-HANS is a Python toolkit for in-the-wild audio enhancement, including
...
We present a novel source separation model to decompose a single-channel
...
We address the problem of speech enhancement generalisation to unseen
en...
Ongoing developments in neural network models are continually advancing ...
We consider the task of weakly supervised one-shot detection. In this ta...
We consider neural network training, in applications in which there are ...
When humans learn a new concept, they might ignore examples that they ca...
Traditional convolutional layers extract features from patches of data b...