This paper introduces an end-to-end neural speech restoration model,
HD-...
Multi-lingual speech recognition aims to distinguish linguistic expressi...
The goal of this work is zero-shot text-to-speech synthesis, with speaki...
We propose DiffSep, a new single channel source separation method based ...
In this paper, we propose a novel end-to-end user-defined keyword spotti...
Deep learning has brought impressive progress in the study of both autom...
Modern neural speech enhancement models usually include various forms of...
ASV (automatic speaker verification) systems are intrinsically required ...
In this work, we present a novel audio-visual dataset for active speaker...
In this paper, we address the problem of separating individual speech si...
Many approaches can derive information about a single speaker's identity...
In this paper, we propose an effective training strategy to ex-tract rob...
The goal of this work is to synchronise audio and video of a talking fac...
The objective of this paper is to separate a target speaker's speech fro...
The goal of this work is to train discriminative cross-modal embeddings
...
Deep clustering is a deep neural network-based speech separation algorit...
This paper proposes a new strategy for learning powerful cross-modal
emb...