Face swapping aims at injecting a source image's identity (i.e., facial
...
In recent years, emotional text-to-speech has shown considerable progres...
Expressive text-to-speech has shown improved performance in recent years...
Recent developments in deep learning have significantly improved the qua...
This paper describes a fast speaker search system to retrieve segments o...
Although there are more than 6,500 languages in the world, the
pronunci...
We propose prosody embeddings for emotional and expressive speech synthe...
We propose a neural text-to-speech (TTS) model that can imitate a new
sp...
In this paper, we introduce an emotional speech synthesizer based on the...