Speech Enhancement with Multi-granularity Vector Quantization

02/16/2023
by Xiao-Ying Zhao, et al.

With advances in deep learning, neural-network-based speech enhancement (SE) has developed rapidly over the last decade. Meanwhile, self-supervised pre-trained models and vector quantization (VQ) have achieved excellent performance on many speech-related tasks, yet they remain less explored for SE. Our previous work showed that using a VQ module to discretize noisy speech representations benefits speech denoising; in this work we therefore study the impact of applying VQ at different layers with different numbers of codebooks. Different VQ modules make it possible to extract speech features at multiple granularities. Through an attention mechanism, the contextual features extracted by a pre-trained model are fused with the local features extracted by the encoder, so that both global and local information is preserved for reconstructing the enhanced speech. Experimental results on the Valentini dataset show that the proposed model improves SE performance, and they also reveal the impact of the choice of pre-trained model.
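
To make the idea of quantizing with different numbers of codebooks concrete, here is a minimal sketch in plain Python. It assumes a grouped scheme in which a feature vector is split into equal-width chunks and each chunk is snapped to the nearest entry of its own codebook; the abstract does not say how the codebooks are applied, so this grouping, the function names `nearest_code` and `quantize`, and the toy codebook values are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def nearest_code(x: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to x (squared L2 distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, code in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(x, code))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def quantize(
    x: Vector, codebooks: Sequence[Sequence[Vector]]
) -> Tuple[List[int], List[float]]:
    """Quantize x with one codebook per equal-width chunk (assumed grouping).

    Returns the chosen index for each chunk and the concatenated code vectors.
    """
    num_groups = len(codebooks)
    if num_groups == 0 or len(x) % num_groups != 0:
        raise ValueError("len(x) must be a positive multiple of the number of codebooks")
    width = len(x) // num_groups
    indices: List[int] = []
    quantized: List[float] = []
    for g, codebook in enumerate(codebooks):
        chunk = x[g * width:(g + 1) * width]
        idx = nearest_code(chunk, codebook)
        indices.append(idx)
        quantized.extend(codebook[idx])
    return indices, quantized


# Toy 4-dimensional "frame" standing in for a noisy speech representation.
frame = [0.9, 0.1, 0.1, 0.8]

# Coarse: a single codebook over the whole vector (2 possible outputs).
one_book = [[[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]]

# Finer: two codebooks, one per half of the vector (2 x 2 = 4 possible outputs).
half_book = [[1.0, 0.0], [0.0, 1.0]]
two_books = [half_book, half_book]

coarse_idx, coarse_vec = quantize(frame, one_book)
fine_idx, fine_vec = quantize(frame, two_books)
```

With the same number of entries per codebook, the two-codebook setting can represent four distinct outputs instead of two, so `fine_vec` lands closer to `frame` than `coarse_vec` does. This is one simple way in which the number of codebooks changes the granularity of the discretized features.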


research
09/28/2022

Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization

With the development of deep learning, neural network-based speech enhan...
research
08/28/2023

Rep2wav: Noise Robust text-to-speech Using self-supervised representations

Benefiting from the development of deep learning, text-to-speech (TTS) t...
research
12/07/2022

Selector-Enhancer: Learning Dynamic Selection of Local and Non-local Attention Operation for Speech Enhancement

Attention mechanisms, such as local and non-local attention, play a fund...
research
06/14/2023

Feature Normalization for Fine-tuning Self-Supervised Models in Speech Enhancement

Large, pre-trained representation models trained using self-supervised l...
research
05/31/2019

Increasing Compactness Of Deep Learning Based Speech Enhancement Models With Parameter Pruning And Quantization Techniques

Most recent studies on deep learning based speech enhancement (SE) focus...
research
11/04/2022

Self-Supervised Learning for Speech Enhancement through Synthesis

Modern speech enhancement (SE) networks typically implement noise suppre...
research
07/27/2019

Dilated FCN: Listening Longer to Hear Better

Deep neural network solutions have emerged as a new and powerful paradig...
