End-to-End Speaker Diarization as Post-Processing

12/18/2020
by Shota Horiguchi, et al.

This paper investigates using an end-to-end diarization model as a post-processing step for conventional clustering-based diarization. Clustering-based methods partition frames into as many clusters as there are speakers; because each frame is assigned to exactly one speaker, they typically cannot handle overlapping speech. In contrast, some end-to-end diarization methods handle overlapping speech by treating the problem as multi-label classification. Although some of these methods can handle a flexible number of speakers, they do not perform well when the number of speakers is large. So that the two approaches compensate for each other's weaknesses, we propose using a two-speaker end-to-end diarization method to post-process the results obtained by a clustering-based method. We iteratively select two speakers from the results and update the results for those two speakers to improve the overlapped regions. Experimental results show that the proposed algorithm consistently improved the performance of state-of-the-art methods across the CALLHOME, AMI, and DIHARD II datasets.
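
The iterative pairwise update described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `refine_pairwise`, the callable `two_speaker_model`, the frame-selection rule (frames where either selected speaker is active), the permutation check, and the 0.5 threshold are all assumptions made for illustration, and the full text may specify these steps differently.

```python
from itertools import combinations

import numpy as np


def refine_pairwise(labels, features, two_speaker_model, threshold=0.5):
    """Refine clustering-based diarization labels one speaker pair at a time.

    labels:   (T, S) binary array, 1 where a speaker is active in a frame.
    features: (T, D) array of per-frame acoustic features.
    two_speaker_model: hypothetical callable mapping (T', D) features to
        (T', 2) speech-activity posteriors in [0, 1] (an assumption here,
        standing in for a two-speaker end-to-end diarization model).
    """
    refined = labels.astype(bool).copy()
    for i, j in combinations(range(refined.shape[1]), 2):
        # Assumed selection rule: frames where either chosen speaker is active.
        frames = np.flatnonzero(refined[:, i] | refined[:, j])
        if frames.size == 0:
            continue
        pred = two_speaker_model(features[frames]) > threshold  # (T', 2)
        current = refined[frames][:, [i, j]]
        # The model's output order is arbitrary, so keep whichever column
        # order agrees with the current labels on more entries.
        keep = (pred == current).sum()
        swap = (pred[:, ::-1] == current).sum()
        if swap > keep:
            pred = pred[:, ::-1]
        # Overwrite only the two selected speakers on the selected frames;
        # both columns may be 1, which is how overlap gets introduced.
        refined[frames, i] = pred[:, 0]
        refined[frames, j] = pred[:, 1]
    return refined.astype(labels.dtype)
```

Because each call to the two-speaker model only ever sees a pair of speakers, the sketch stays within the regime where, according to the abstract, end-to-end models perform well, while the clustering result still determines how many speakers there are.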

research
01/21/2021

Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers

This paper proposes an online end-to-end diarization that can handle ove...
research
03/31/2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

In this paper, we present a novel framework that jointly performs speake...
research
07/04/2021

Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors

Attractor-based end-to-end diarization is achieving comparable accuracy ...
research
09/14/2021

Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation

We propose to address online speaker diarization as a combination of inc...
research
03/18/2022

Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios

Overlapping speech diarization has been traditionally treated as a multi...
research
05/22/2020

Identify Speakers in Cocktail Parties with End-to-End Attention

In scenarios where multiple speakers talk at the same time, it is import...
research
04/26/2022

Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure

Clustering-based speaker diarization has stood firm as one of the major ...