In this paper, we explored how to boost speech emotion recognition (SER)...
Although diffusion models in text-to-speech have become a popular choice...
Although high-fidelity speech can be obtained for intralingual speech
sy...
The utilization of discrete speech tokens, divided into semantic tokens ...
In this paper, we describe the systems developed by the SJTU X-LANCE tea...
In this work, we present DiffVoice, a novel text-to-speech model based o...
Although current neural text-to-speech (TTS) models are able to generate...
The mainstream neural text-to-speech (TTS) pipeline is a cascade system,
...
Although word-level prosody modeling in neural text-to-speech (TTS) has ...
Popular node embedding methods such as DeepWalk follow the paradigm of
p...