Singing Voice Synthesis with Vibrato Modeling and Latent Energy Representation

11/02/2022
by Yingjie Song, et al.

This paper proposes an expressive singing voice synthesis system that introduces explicit vibrato modeling and a latent energy representation. Vibrato is an inherent characteristic of human singing and is therefore essential to the naturalness of synthesized singing. We introduce a deep learning-based vibrato model that controls the likeliness, rate, depth and phase of vibrato, where the vibrato likeliness is the probability that vibrato is present; modeling it helps improve the naturalness of the singing voice. Existing singing corpora carry no annotated labels for vibrato likeliness, so we adopt a novel labeling method that assigns these labels automatically. In addition, the power spectrogram of the audio contains rich information that can improve the expressiveness of singing, and we propose an autoencoder-based latent energy bottleneck feature to exploit it. Experimental results on the open dataset NUS48E show that both the vibrato modeling and the latent energy representation significantly improve the expressiveness of the singing voice. Audio samples are available on the demo website.
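To make the four vibrato controls named in the abstract concrete, the following is a minimal sketch of a generic sinusoidal vibrato applied to a frame-level F0 contour. It is not the paper's deep learning model: the abstract does not say how the parameters are predicted or applied. The function name `apply_vibrato`, the use of cents for depth, and all numeric values are assumptions made for illustration only.

```python
import math


def apply_vibrato(f0_hz, frame_rate_hz, rate_hz, depth_cents, phase_rad, likeliness):
    """Modulate a frame-level F0 contour with a sinusoidal vibrato.

    Hypothetical helper, not taken from the paper.
    f0_hz:       base F0 per frame in Hz (0.0 marks an unvoiced frame)
    likeliness:  per-frame value in [0, 1] that scales the vibrato depth,
                 standing in for the "existence probability" of vibrato
    """
    out = []
    for i, (f0, p) in enumerate(zip(f0_hz, likeliness)):
        if f0 <= 0.0:
            # Leave unvoiced frames untouched.
            out.append(0.0)
            continue
        t = i / frame_rate_hz  # frame time in seconds
        # Pitch deviation in cents: depth gated by likeliness, oscillating at rate_hz.
        cents = p * depth_cents * math.sin(2.0 * math.pi * rate_hz * t + phase_rad)
        # 1200 cents per octave, so the deviation is a multiplicative factor on F0.
        out.append(f0 * 2.0 ** (cents / 1200.0))
    return out


# Assumed example: a flat 220 Hz note at 100 frames per second,
# with vibrato switched on only in the second half of the note.
base = [220.0] * 100
gate = [0.0] * 50 + [1.0] * 50
contour = apply_vibrato(base, 100.0, rate_hz=5.0, depth_cents=50.0,
                        phase_rad=0.0, likeliness=gate)
```

Scaling the depth by the likeliness is one simple way to let a probability-like value decide where vibrato appears; a hard threshold on the same value would be an equally plausible reading of the abstract.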

research
08/31/2023

Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information

This paper presents an end-to-end high-quality singing voice synthesis (...
research
06/15/2021

MLP Singer: Towards Rapid Parallel Korean Singing Voice Synthesis

Recent developments in deep learning have significantly improved the qua...
research
06/12/2023

HiddenSinger: High-Quality Singing Voice Synthesis via Neural Audio Codec and Latent Diffusion Models

Recently, denoising diffusion models have demonstrated remarkable perfor...
research
09/06/2022

The Role of Voice Persona in Expressive Communication: An Argument for Relevance in Speech Synthesis Design

We present an approach to imbuing expressivity in a synthesized voice by...
research
09/01/2023

Enhancing the vocal range of single-speaker singing voice synthesis with melody-unsupervised pre-training

The single-speaker singing voice synthesis (SVS) usually underperforms a...
research
06/29/2023

Singing Voice Synthesis Using Differentiable LPC and Glottal-Flow-Inspired Wavetables

This paper introduces GlOttal-flow LPC Filter (GOLF), a novel method for...
research
03/21/2022

WeSinger: Data-augmented Singing Voice Synthesis with Auxiliary Losses

In this paper, we develop a new multi-singer Chinese neural singing voic...
