Attention-based multi-channel speaker verification with ad-hoc microphone arrays

07/01/2021
by Chengdong Liang, et al.

Ad-hoc microphone arrays have recently been widely studied. Unlike traditional microphone arrays, the spatial arrangement and the number of microphones in an ad-hoc microphone array are not known in advance, which hinders the adaptation of traditional speaker verification technologies to ad-hoc microphone arrays. To overcome this problem, we propose attention-based multi-channel speaker verification with ad-hoc microphone arrays. Specifically, we add an inter-channel processing layer and a global fusion layer after the pooling layer of a single-channel speaker verification system. The inter-channel processing layer applies residual self-attention along the channel dimension to allocate weights to different microphones. The global fusion layer integrates all channels in a way that is independent of the number of input channels. We further replace the softmax operator in the residual self-attention with sparsemax, which forces the channel weights of very noisy channels to zero. Experimental results with ad-hoc microphone arrays of over 30 channels demonstrate the effectiveness of the proposed methods. For example, the multi-channel speaker verification with sparsemax achieves an equal error rate (EER) of over 20 semi-real data sets, and over 30 scenarios with both matched and mismatched channel numbers.
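
The abstract does not give the exact layer definitions, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes that each channel has already been pooled into one utterance-level embedding, that the attention is a single-head scaled dot-product attention across channels, and that the global fusion is a plain mean over channels. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and all dimensions are hypothetical.

```python
import numpy as np


def sparsemax(z):
    """Row-wise sparsemax: Euclidean projection of each row onto the simplex."""
    z_sorted = np.sort(z, axis=-1)[..., ::-1]            # descending order
    k = np.arange(1, z.shape[-1] + 1)
    cumsum = np.cumsum(z_sorted, axis=-1)
    support = 1 + k * z_sorted > cumsum                  # entries kept non-zero
    k_z = support.sum(axis=-1, keepdims=True)            # support size per row
    tau = (np.take_along_axis(cumsum, k_z - 1, axis=-1) - 1) / k_z
    return np.maximum(z - tau, 0.0)


def residual_channel_attention(x, w_q, w_k, w_v):
    """Self-attention along the channel axis with a residual connection.

    x: (num_channels, dim) pooled embeddings, one row per microphone.
    Assumption: single-head scaled dot-product attention.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])              # (channels, channels)
    weights = sparsemax(scores)                          # sparse channel weights
    return x + weights @ v                               # residual connection


def global_fusion(x):
    """Assumed fusion: mean over channels, valid for any channel count."""
    return x.mean(axis=0)


rng = np.random.default_rng(0)
dim, d_k = 8, 4                                          # hypothetical sizes
w_q = rng.standard_normal((dim, d_k))
w_k = rng.standard_normal((dim, d_k))
w_v = rng.standard_normal((dim, dim))

# The same parameters handle arrays with different numbers of microphones.
fused = {}
for num_channels in (4, 30):
    x = rng.standard_normal((num_channels, dim))
    fused[num_channels] = global_fusion(
        residual_channel_attention(x, w_q, w_k, w_v)
    )
```

Because the attention weights and the mean are both computed over the channel axis, the fused embedding has the same size whether the array has 4 or 30 microphones. Unlike softmax, sparsemax can return exact zeros, which is the property the abstract relies on to discard very noisy channels.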

