Pushing the limits of self-supervised speaker verification using regularized distillation framework

11/08/2022
by   Yafeng Chen, et al.

Training robust speaker verification systems without speaker labels has long been a challenging task. Previous studies observed a large performance gap between self-supervised and fully supervised methods. In this paper, we apply a non-contrastive self-supervised learning framework called DIstillation with NO labels (DINO) and propose two regularization terms applied to the embeddings in DINO. One regularization term guarantees the diversity of the embeddings, while the other decorrelates the variables of each embedding. The effectiveness of various data augmentation techniques is explored, in both the time and frequency domains. A range of experiments conducted on the VoxCeleb datasets demonstrates the superiority of the regularized DINO framework for speaker verification. Our method achieves state-of-the-art speaker verification performance under a single-stage self-supervised setting on VoxCeleb. The code will be made publicly available.
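The two embedding regularizers described in the abstract (diversity and decorrelation) resemble the variance and covariance penalties popularized by VICReg-style objectives. The following is a minimal NumPy sketch under that assumption; the function names, the hinge margin `gamma`, and the exact normalization are illustrative, not the paper's definitions:

```python
import numpy as np

def diversity_loss(emb, gamma=1.0, eps=1e-4):
    """Encourage embedding diversity across the batch.

    Hinge on the per-dimension standard deviation: any dimension whose
    std across the batch falls below `gamma` is penalized, which
    discourages collapse to a constant embedding.
    """
    std = np.sqrt(emb.var(axis=0) + eps)
    return float(np.mean(np.maximum(0.0, gamma - std)))

def decorrelation_loss(emb):
    """Decorrelate the variables of each embedding.

    Penalizes the squared off-diagonal entries of the batch covariance
    matrix, pushing different embedding dimensions toward independence.
    """
    n, d = emb.shape
    centered = emb - emb.mean(axis=0)
    cov = centered.T @ centered / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float(np.sum(off_diag ** 2) / d)

# Example: collapsed embeddings incur a high diversity penalty,
# while duplicated dimensions incur a high decorrelation penalty.
rng = np.random.default_rng(0)
collapsed = np.ones((32, 8))            # every utterance maps to one point
spread = rng.normal(size=(32, 8))       # well-spread embeddings
print(diversity_loss(collapsed), diversity_loss(spread))
```

In a DINO-style setup these terms would be added, with tuned weights, to the cross-entropy between teacher and student output distributions.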


