SE-Bridge: Speech Enhancement with Consistent Brownian Bridge

05/23/2023
by Zhibin Qiu, et al.

We propose SE-Bridge, a novel method for speech enhancement (SE). With diffusion models recently applied to speech enhancement, enhancement can be achieved by solving a stochastic differential equation (SDE). Each SDE corresponds to a probability flow ordinary differential equation (PF-ODE), and the trajectory of the PF-ODE solution consists of the speech states at different moments. Our approach is based on the consistency model, which ensures that all speech states on the same PF-ODE trajectory correspond to the same initial state. By integrating the Brownian Bridge process, the model is able to generate high-intelligibility speech samples without adversarial training. This is the first attempt to apply consistency models to the SE task, and it achieves state-of-the-art results on several metrics while sampling 15× faster than the diffusion-based baseline. Our experiments on multiple datasets demonstrate the effectiveness of SE-Bridge for SE. Furthermore, extensive experiments on downstream tasks, including Automatic Speech Recognition (ASR) and Speaker Verification (SV), show that SE-Bridge can effectively support multiple downstream tasks.
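To make the Brownian Bridge idea concrete, the sketch below samples an intermediate state of a generic Brownian bridge pinned at a clean signal at t = 0 and a noisy signal at t = T. It is a minimal illustration under stated assumptions, not the paper's implementation: the function name brownian_bridge_marginal, the parameters T and sigma, and the toy four-sample "signals" are all hypothetical, and SE-Bridge may parameterize the bridge differently. It uses only the textbook marginal of a Brownian bridge, whose mean interpolates linearly between the endpoints and whose variance sigma^2 * t * (T - t) / T vanishes at both ends.

```python
import math
import random


def brownian_bridge_marginal(x0, y, t, T=1.0, sigma=1.0, rng=None):
    """Sample the state at time t of a Brownian bridge pinned at x0 (t=0) and y (t=T).

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if not 0.0 <= t <= T:
        raise ValueError("t must lie in [0, T]")
    rng = rng or random.Random()
    w = t / T                                   # interpolation weight toward y
    std = sigma * math.sqrt(t * (T - t) / T)    # zero at both endpoints
    return [(1.0 - w) * a + w * b + std * rng.gauss(0.0, 1.0)
            for a, b in zip(x0, y)]


# Toy stand-ins for a clean and a noisy speech frame (assumed values).
clean = [0.0, 0.5, -0.5, 0.25]
noisy = [0.1, 0.7, -0.2, 0.0]

rng = random.Random(0)
start = brownian_bridge_marginal(clean, noisy, 0.0, rng=rng)  # equals clean
end = brownian_bridge_marginal(clean, noisy, 1.0, rng=rng)    # equals noisy
mid = brownian_bridge_marginal(clean, noisy, 0.5, rng=rng)    # random state in between
```

Because the variance is zero at both endpoints, the process starts exactly at the clean signal and ends exactly at the noisy one. In this picture, a consistency model would be trained to map any intermediate state such as mid directly back to the state at the start of its trajectory.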

research
09/15/2022

MVNet: Memory Assistance and Vocal Reinforcement Network for Speech Enhancement

Speech enhancement improves speech quality and promotes the performance ...
research
08/27/2021

Task-aware Warping Factors in Mask-based Speech Enhancement

This paper proposes the use of two task-aware warping factors in mask-ba...
research
05/24/2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

Self-supervised learning (SSL) is the latest breakthrough in speech proc...
research
07/19/2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

This paper presents recent progress on integrating speech separation and...
research
05/18/2023

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

Diffusion-based speech enhancement (SE) has been investigated recently, ...
research
06/14/2023

Variance-Preserving-Based Interpolation Diffusion Models for Speech Enhancement

The goal of this study is to implement diffusion models for speech enhan...
research
07/04/2021

TENET: A Time-reversal Enhancement Network for Noise-robust ASR

Due to the unprecedented breakthroughs brought about by deep learning, s...
