Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention

02/14/2020
by Yuma Koizumi, et al.

This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features; the speaker representation used for adaptation is extracted directly from the test utterance. Conventional studies of deep neural network (DNN)-based speech enhancement mainly focus on building a speaker-independent model. Meanwhile, in speech applications including speech recognition and synthesis, model adaptation to the target speaker is known to improve accuracy. Our research question is whether a DNN for speech enhancement can be adapted to unknown speakers without any auxiliary guidance signal in the test phase. To achieve this, we adopt multi-task learning of speech enhancement and speaker identification, and use the output of the final hidden layer of the speaker identification branch as an auxiliary feature. In addition, we use multi-head self-attention to capture long-term dependencies in the speech and noise. Experimental results on a public dataset show that our strategy achieves state-of-the-art performance and also outperforms conventional methods in terms of subjective quality.
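
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random weights stand in for trained parameters, and the time-averaged speaker branch, the layer sizes, the sigmoid mask, and all function names (`speaker_branch`, `multi_head_self_attention`, `enhance`) are assumptions made for illustration. It shows only the two ideas named above: a speaker feature taken from the final hidden layer of a speaker identification branch and computed from the test utterance itself, and multi-head self-attention across all frames.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def rand_matrix(rows, cols, rng):
    """Random stand-in for a trained weight matrix (illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def multi_head_self_attention(x, num_heads, rng):
    """x is T frames by D features. Returns (T x D output, per-head attention maps)."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "feature size must divide evenly into heads"
    d_head = d_model // num_heads
    heads, maps = [], []
    for _ in range(num_heads):
        q = matmul(x, rand_matrix(d_model, d_head, rng))
        k = matmul(x, rand_matrix(d_model, d_head, rng))
        v = matmul(x, rand_matrix(d_model, d_head, rng))
        k_t = [list(col) for col in zip(*k)]  # transpose: d_head x T
        scores = matmul(q, k_t)               # T x T, every frame attends to every frame
        attn = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        heads.append(matmul(attn, v))
        maps.append(attn)
    # Concatenate the heads along the feature axis, then mix them linearly.
    concat = [sum((h[t] for h in heads), []) for t in range(len(x))]
    return matmul(concat, rand_matrix(d_model, d_model, rng)), maps

def speaker_branch(x, hidden_dim, rng):
    """Assumed speaker-ID branch: time average followed by one tanh hidden layer.
    The returned hidden activation plays the role of the auxiliary speaker feature."""
    num_frames = len(x)
    mean = [[sum(frame[d] for frame in x) / num_frames for d in range(len(x[0]))]]
    hidden = matmul(mean, rand_matrix(len(x[0]), hidden_dim, rng))[0]
    return [math.tanh(h) for h in hidden]

def enhance(noisy, num_heads=2, spk_dim=4, seed=0):
    """noisy is T x D non-negative magnitudes. Returns (enhanced, speaker feature, maps)."""
    rng = random.Random(seed)
    d_model = len(noisy[0])
    # Self-adaptation: the speaker feature comes from the test utterance itself.
    spk = speaker_branch(noisy, spk_dim, rng)
    conditioned = [frame + spk for frame in noisy]  # append it to every frame
    hidden = matmul(conditioned, rand_matrix(d_model + spk_dim, d_model, rng))
    context, maps = multi_head_self_attention(hidden, num_heads, rng)
    # Assumed output stage: a sigmoid mask applied to the noisy magnitudes.
    mask = [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in context]
    enhanced = [[m * n for m, n in zip(mr, nr)] for mr, nr in zip(mask, noisy)]
    return enhanced, spk, maps

demo_rng = random.Random(1)
noisy = [[demo_rng.random() for _ in range(4)] for _ in range(6)]  # 6 frames, 4 bins
enhanced, spk, maps = enhance(noisy)
print(len(enhanced), len(enhanced[0]), len(spk), len(maps))
```

In multi-task training, the speaker branch would additionally feed a speaker classification output whose loss is combined with the enhancement loss; at test time only the hidden activation is needed, so no enrollment utterance or other guidance signal is required.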

research
05/15/2020

Speaker Re-identification with Speaker Dependent Speech Enhancement

While the use of deep neural networks has significantly boosted speaker ...
research
01/07/2021

Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario

Multi-task learning (MTL) and attention mechanism have been proven to ef...
research
05/30/2022

Personalized Acoustic Echo Cancellation for Full-duplex Communications

Deep neural networks (DNNs) have shown promising results for acoustic ec...
research
11/14/2022

Multi-Label Training for Text-Independent Speaker Identification

In this paper, we propose a novel strategy for text-independent speaker ...
research
05/10/2020

Cognitive-driven convolutional beamforming using EEG-based auditory attention decoding

The performance of speech enhancement algorithms in a multi-speaker scen...
research
10/17/2022

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Speaker verification (SV) suffers from unsatisfactory performance in far...
research
11/03/2022

Dynamic Kernels and Channel Attention with Multi-Layer Embedding Aggregation for Speaker Verification

State-of-the-art speaker verification frameworks have typically focused ...
