Towards Generalized Speech Enhancement with Generative Adversarial Networks

04/06/2019
by Santiago Pascual, et al.

The speech enhancement task usually consists of removing additive noise or reverberation that partially mask spoken utterances and affect their intelligibility. However, little attention has been paid to other, perhaps more aggressive signal distortions such as clipping, chunk elimination, or frequency-band removal. Such distortions can have a large impact not only on intelligibility, but also on naturalness or even speaker identity, and they require careful signal reconstruction. In this work, we give full consideration to this generalized speech enhancement task, and show that it can be tackled with a time-domain generative adversarial network (GAN). In particular, we extend a previous GAN-based speech enhancement system to deal with mixtures of four types of aggressive distortions. First, we propose the addition of an adversarial acoustic regression loss that promotes richer feature extraction at the discriminator. Second, we make use of a two-step adversarial training schedule, acting as a warm-up-and-fine-tune sequence. Both objective and subjective evaluations show that these two additions yield improved speech reconstructions that better match the original speaker identity and naturalness.
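
To make the distortions named above concrete, the following is a minimal NumPy sketch of clipping, chunk elimination, and frequency-band removal applied to a waveform. It is an illustration only and is not taken from the paper: the function names, thresholds, chunk lengths, and band edges are all hypothetical choices, and the authors' actual distortion pipeline may differ.

```python
import numpy as np


def clip_signal(x, threshold=0.25):
    """Hard-clip samples to [-threshold, threshold] (threshold is an assumed value)."""
    return np.clip(x, -threshold, threshold)


def drop_chunks(x, n_chunks=3, chunk_len=800, rng=None):
    """Zero out n_chunks randomly placed segments of chunk_len samples each."""
    rng = np.random.default_rng() if rng is None else rng
    y = x.copy()
    # Random start indices; segments may overlap, which is fine for a sketch.
    for start in rng.integers(0, len(x) - chunk_len, size=n_chunks):
        y[start:start + chunk_len] = 0.0
    return y


def remove_band(x, sample_rate, low_hz, high_hz):
    """Remove a frequency band by zeroing the FFT bins in [low_hz, high_hz]."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    spectrum[(freqs >= low_hz) & (freqs <= high_hz)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))


# Toy input: one second of two sinusoids at 16 kHz (a stand-in for speech).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

# A mixture of distortions, applied one after another.
distorted = clip_signal(clean)
distorted = drop_chunks(distorted, rng=np.random.default_rng(0))
distorted = remove_band(distorted, sample_rate, low_hz=2000, high_hz=4000)
```

A generalized enhancement model in the sense of the abstract would take a signal like `distorted` as input and try to reconstruct `clean`.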

research
02/04/2021

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech enhancement is an essential task of improving speech quality in n...
research
07/27/2020

On the Use of Audio Fingerprinting Features for Speech Enhancement with Generative Adversarial Network

The advent of learning-based methods in speech enhancement has revived t...
research
03/30/2021

Time-domain Speech Enhancement with Generative Adversarial Learning

Speech enhancement aims to obtain speech signals with high intelligibili...
research
09/13/2019

Spoken Speech Enhancement using EEG

In this paper we demonstrate spoken speech enhancement using electroence...
research
06/16/2021

A Flow-Based Neural Network for Time Domain Speech Enhancement

Speech enhancement involves the distinction of a target speech signal fr...
research
06/13/2020

SE-MelGAN – Speaker Agnostic Rapid Speech Enhancement

Recent advancement in Generative Adversarial Networks in speech synthesi...
research
01/31/2021

High Fidelity Speech Regeneration with Application to Speech Enhancement

Speech enhancement has seen great improvement in recent years mainly thr...
