We present a noisy channel generative model of two sequences, for exampl...
This work explores the task of synthesizing speech in nonexistent human-...
Non-saturating generative adversarial network (GAN) training is widely u...
Despite the ability to produce human-level speech for in-domain text, at...
We present a novel generative model that combines state-of-the-art neura...
Recent work has explored sequence-to-sequence latent variable models for...
Global Style Tokens (GSTs) are a recently-proposed method to learn laten...
We present an extension to the Tacotron speech synthesis architecture th...
In this work, we propose "global style tokens" (GSTs), a bank of embeddi...
Prosodic modeling is a core problem in speech synthesis. The key challen...
A text-to-speech synthesis system typically consists of multiple stages,...