Local-global speaker representation for target speaker extraction

10/28/2022
by Shulin He, et al.

Target speaker extraction aims to extract the target speaker's voice from a mixture of signals, given an enrollment utterance from that speaker. This enrollment utterance is also called anchor speech, and using it effectively is crucial for speaker extraction. In this study, we propose a new system that fully exploits the speaker information in anchor speech. Unlike models that use only local or only global features of the anchor, the proposed method extracts speaker information at both the global and local levels and feeds these features into a speech separation network. The approach benefits from the complementary advantages of global and local features, which improves speaker extraction performance. We verified the feasibility of this local-global representation (LGR) method with multiple speaker extraction models. Systematic experiments on the open-source dataset Libri-2talker showed that the proposed method significantly outperformed the baseline models.
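The abstract does not say how the global and local features are computed or fused, so the following is only a minimal sketch of the general idea under stated assumptions: the global feature is taken to be a mean over the anchor's frames, the local feature is taken to be a dot-product attention readout of anchor frames for each mixture frame, and fusion is plain concatenation. The function names (`global_embedding`, `local_embeddings`, `local_global_condition`) and the toy frame vectors are hypothetical and are not taken from the paper.

```python
import math

def global_embedding(anchor):
    """Assumed global feature: mean of the anchor frames over time."""
    n, dim = len(anchor), len(anchor[0])
    return [sum(frame[d] for frame in anchor) / n for d in range(dim)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def local_embeddings(mixture, anchor):
    """Assumed local feature: for each mixture frame, an attention-weighted
    sum of anchor frames (scaled dot-product scores)."""
    dim = len(anchor[0])
    scale = math.sqrt(dim)
    out = []
    for m_frame in mixture:
        scores = [sum(a * b for a, b in zip(m_frame, a_frame)) / scale
                  for a_frame in anchor]
        weights = softmax(scores)
        out.append([sum(w * a_frame[d] for w, a_frame in zip(weights, anchor))
                    for d in range(dim)])
    return out

def local_global_condition(mixture, anchor):
    """Assumed fusion: concatenate [mixture frame, local feature, global feature].
    The result would be the input to a separation network (not shown)."""
    g = global_embedding(anchor)
    local = local_embeddings(mixture, anchor)
    return [m_frame + l_frame + g for m_frame, l_frame in zip(mixture, local)]

# Toy frame-level features: 3 mixture frames and 2 anchor frames, 2 dims each.
mixture = [[0.5, 1.0], [1.5, 0.0], [0.0, 2.0]]
anchor = [[1.0, 3.0], [3.0, 5.0]]
conditioned = local_global_condition(mixture, anchor)
print(len(conditioned), len(conditioned[0]))
```

In this sketch the global vector is identical for every mixture frame, while the local vector changes from frame to frame depending on which anchor frames it resembles most; that contrast is the complementarity the abstract refers to, although the paper's actual encoders and fusion may differ.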


