Neural text-to-speech systems are often optimized on L1/L2 losses, which...
In this paper, we propose GlowVC: a multilingual multi-speaker flow-base...
Duration modelling has become an important research problem once more wi...
Non-parallel voice conversion (VC) is typically achieved using lossy rep...
Whilst recent neural text-to-speech (TTS) approaches produce high-qualit...
While recent neural text-to-speech (TTS) systems perform remarkably well...
Neural text-to-speech synthesis (NTTS) models have shown significant pro...
Recent speech synthesis systems based on sampling from autoregressive ne...
Statistical TTS systems that directly predict the speech waveform have r...
This paper introduces a robust universal neural vocoder trained with 74 ...
Output from statistical parametric speech synthesis (SPSS) remains notic...