Continuous Speech Separation with Recurrent Selective Attention Network

10/28/2021
by   Yixuan Zhang, et al.
0

While permutation invariant training (PIT) based continuous speech separation (CSS) significantly improves the conversation transcription accuracy, it often suffers from speech leakages and failures in separation at "hot spot" regions because it has a fixed number of output channels. In this paper, we propose to apply recurrent selective attention network (RSAN) to CSS, which generates a variable number of output channels based on active speaker counting. In addition, we propose a novel block-wise dependency extension of RSAN by introducing dependencies between adjacent processing blocks in the CSS framework. It enables the network to utilize the separation results from the previous blocks to facilitate the current block processing. Experimental results on the LibriCSS dataset show that the RSAN-based CSS (RSAN-CSS) network consistently improves the speech recognition accuracy over PIT-based models. The proposed block-wise dependency modeling further boosts the performance of RSAN-CSS.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/21/2019

All-neural online source separation, counting, and diarization for meeting analysis

Automatic meeting analysis comprises the tasks of speaker counting, spea...
research
08/13/2020

Continuous Speech Separation with Conformer

Continuous speech separation plays a vital role in complicated speech re...
research
04/13/2019

Low-Latency Speaker-Independent Continuous Speech Separation

Speaker independent continuous speech separation (SI-CSS) is a task of c...
research
07/30/2021

Graph-PIT: Generalized permutation invariant training for continuous separation of arbitrary numbers of speakers

Automatic transcription of meetings requires handling of overlapped spee...
research
10/27/2021

Separating Long-Form Speech with Group-Wise Permutation Invariant Training

Multi-talker conversational speech processing has drawn many interests f...
research
10/30/2019

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

An important problem in ad-hoc microphone speech separation is how to gu...
research
10/23/2020

Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer

With its strong modeling capacity that comes from a multi-head and multi...

Please sign up or login with your details

Forgot password? Click here to reset