We propose a novel framework for electrolaryngeal speech intelligibility...
This study introduces a novel training paradigm, audio difference learni...
Non-autoregressive (non-AR) sequence-to-sequence (seq2seq) models for vo...
Foreign accent conversion (FAC) is a special application of voice conver...
We present the latest iteration of the voice conversion challenge (VCC) ...
Deaf or hard-of-hearing (DHH) speakers typically have atypical speech ca...
Text-to-speech synthesis (TTS) is a task to convert texts into speech. T...
End-to-end text-to-speech synthesis (TTS) can generate highly natural sy...
The criteria for measuring music similarity are important for developing...
Deep neural network (DNN)-based speech enhancement usually uses a clean ...
Research on automatic speech recognition (ASR) systems for electrolaryng...
This paper describes the design of NNSVS, open-source software for ne...
Our previous work, the unified source-filter GAN (uSFGAN) vocoder, intro...
Sequence-to-sequence (seq2seq) voice conversion (VC) models have greater...
Neural-based text-to-speech (TTS) systems achieve very high-fidelity spe...
We present a large-scale comparative study of self-supervised speech rep...
This paper presents a new voice conversion (VC) framework capable of dea...
Anomalous sound detection systems must detect unknown, atypical sounds u...
This paper introduces a unified source-filter network with a harmonic-pl...
We investigate the performance of self-supervised pretraining frameworks...
We present the first edition of the VoiceMOS Challenge, a scientific eve...
Beyond the conventional voice conversion (VC) where the speaker informat...
Without the need of a clean reference, non-intrusive speech assessment m...
An effective approach to automatically predict the subjective rating for...
We present a voice conversion framework that converts normal speech into...
This paper introduces S3PRL-VC, an open-source voice conversion (VC) fra...
In a conventional voice conversion (VC) framework, a VC model is often t...
Voice conversion (VC) is an effective approach to electrolaryngeal (EL) ...
In voice conversion (VC), an approach showing promising results in the l...
An anomalous sound detection system to detect unknown anomalous sounds u...
We propose a new paradigm for maintaining speaker identity in dysarthric...
This paper presents a low-latency real-time (LLRT) non-parallel voice co...
This paper presents a novel high-fidelity and low-latency universal neur...
This paper proposes a novel voice conversion (VC) method based on non-au...
We propose a unified approach to data-driven source-filter modeling usin...
This paper describes the AS-NU systems for two tracks in MultiSpeaker Mu...
In this paper, we present open-source software for developing a nonpa...
We propose a simple method for automatic speech recognition (ASR) by fin...
We present a novel approach to any-to-one (A2O) voice conversion (VC) in...
In this paper, we present the voice conversion (VC) systems developed at...
In this paper, we present a description of the baseline system of Voice ...
This paper presents the sequence-to-sequence (seq2seq) baseline system f...
The Voice Conversion Challenge 2020 is the third edition under its flags...
The voice conversion challenge is a biennial scientific event held to c...
Sequence-to-sequence (seq2seq) voice conversion (VC) models are attracti...
In this paper, we propose a quasi-periodic parallel WaveGAN (QPPWG) wave...
In this paper, a pitch-adaptive waveform generative model named Quasi-Pe...
Recently, the effectiveness of text-to-speech (TTS) systems combined wit...
In this paper, we propose a parallel WaveGAN (PWG)-like neural vocoder w...
This paper proposes a voice conversion (VC) method based on a sequence-t...