Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization

05/18/2020
by Jen-Yu Liu, et al.

In a recent paper, we presented a generative adversarial network (GAN)-based model for unconditional generation of the mel-spectrograms of singing voices. Because the generator is designed to take a variable-length sequence of noise vectors as input, it can generate mel-spectrograms of variable length. However, our previous listening test showed that the quality of the generated audio leaves room for improvement. The present paper extends that work in three respects. First, we employ a hierarchical architecture in the generator to induce some structure in the temporal dimension. Second, we introduce a cycle regularization mechanism to the generator to avoid mode collapse. Third, we evaluate the new model not only on generating singing voices but also on generating speech. Evaluation results show that the new model outperforms the prior one both objectively and subjectively. We also employ the model to unconditionally generate sequences of piano and violin music and find the results promising. Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.
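The abstract does not spell out how the cycle regularization is computed. As a rough illustration only, the sketch below assumes one common form of the idea: a hypothetical encoder maps the generated mel-spectrogram back to the noise sequence, and the reconstruction error is used as a penalty. The linear "networks", the dimensions (`NOISE_DIM`, `N_MELS`, `UPSAMPLE`), and the L1 loss are all assumptions made for this example and are not taken from the paper. It also shows how a variable-length noise sequence yields a variable-length output.

```python
import numpy as np

rng = np.random.default_rng(0)

NOISE_DIM = 8   # hypothetical size of each noise vector
N_MELS = 16     # hypothetical number of mel bins per frame
UPSAMPLE = 4    # hypothetical number of frames produced per noise vector

# Toy linear stand-ins for the generator and encoder (assumption:
# the real models are deep networks trained adversarially).
W_gen = rng.standard_normal((NOISE_DIM, UPSAMPLE * N_MELS)) * 0.1
W_enc = rng.standard_normal((UPSAMPLE * N_MELS, NOISE_DIM)) * 0.1


def generate(z):
    """Map noise of shape (T, NOISE_DIM) to a mel-spectrogram of shape (T * UPSAMPLE, N_MELS)."""
    frames = z @ W_gen                    # (T, UPSAMPLE * N_MELS)
    return frames.reshape(-1, N_MELS)     # (T * UPSAMPLE, N_MELS)


def encode(mel):
    """Map a mel-spectrogram back to an estimate of the noise sequence, shape (T, NOISE_DIM)."""
    grouped = mel.reshape(-1, UPSAMPLE * N_MELS)  # (T, UPSAMPLE * N_MELS)
    return grouped @ W_enc                        # (T, NOISE_DIM)


def cycle_loss(z):
    """Mean absolute error between the input noise and its reconstruction."""
    return float(np.mean(np.abs(encode(generate(z)) - z)))


# The output length follows the number of noise vectors T.
for T in (5, 12):
    z = rng.standard_normal((T, NOISE_DIM))
    print(T, generate(z).shape, cycle_loss(z))
```

In a training loop, a term like `cycle_loss` would typically be added to the generator objective with a weighting factor, so that distinct noise inputs are pushed toward distinguishable outputs; the weight and the exact loss form would need to be taken from the full paper.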


research
03/31/2017

MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation

Most existing neural network models for music generation use recurrent n...
research
08/02/2017

Controllable Generative Adversarial Network

Although it is recently introduced, in last few years, generative advers...
research
05/04/2021

VQCPC-GAN: Variable-length Adversarial Audio Synthesis using Vector-Quantized Contrastive Predictive Coding

Influenced by the field of Computer Vision, Generative Adversarial Netwo...
research
10/13/2021

Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks

A central task of a Disc Jockey (DJ) is to create a mixset of music wit...
research
09/25/2019

High Fidelity Speech Synthesis with Adversarial Networks

Generative adversarial networks have seen rapid development in recent ye...
research
04/02/2022

StyleWaveGAN: Style-based synthesis of drum sounds with extensive controls using generative adversarial networks

In this paper we introduce StyleWaveGAN, a style-based drum sound genera...
research
08/02/2019

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation

In this work, we propose a novel Cycle In Cycle Generative Adversarial N...
