Recently, end-to-end models have been widely used in automatic speech re...
Our prior experiments show that humans and machines seem to employ diffe...
We propose an approach to extract speaker embeddings that are robust to ...
Self-supervised learning (SSL) in the pretraining stage using un-annotat...
In this paper, we explore automatic prediction of dialect density of the...
Children's automatic speech recognition (ASR) is always difficult due to...
This paper presents the results of a pilot study that introduces social ...
This paper proposes a novel linear prediction coding-based data augment...
This paper describes the SPAPL system for the INTERSPEECH 2021 Challenge...
Non-autoregressive mechanisms can significantly decrease inference time ...
Automatic speech recognition (ASR) systems for young children are needed...
We present a bidirectional unsupervised model pre-training (UPT) method ...
Disfluencies are prevalent in spontaneous speech, as shown in many studi...
Does speaking style variation affect humans' ability to distinguish indi...
The effects of speaking-style variability on automatic speaker verificat...
In this paper, we propose a novel way of addressing text-dependent autom...
The great majority of current voice technology applications relies on ac...
This paper focuses on the problem of pitch tracking in noisy conditions....
Text-independent speaker recognition using short utterances is a highly ...