Can large-scale vocoded spoofed data improve speech spoofing countermeasure with a self-supervised front end?

09/12/2023
by   Xin Wang, et al.
1

A speech spoofing countermeasure (CM) that discriminates between unseen spoofed and bona fide data requires diverse training data. While many datasets use spoofed data generated by speech synthesis systems, it was recently found that data vocoded by neural vocoders were also effective as the spoofed training data. Since many neural vocoders are fast in building and generation, this study used multiple neural vocoders and created more than 9,000 hours of vocoded data on the basis of the VoxCeleb2 corpus. This study investigates how this large-scale vocoded data can improve spoofing countermeasures that use data-hungry self-supervised learning (SSL) models. Experiments demonstrated that the overall CM performance on multiple test sets improved when using features extracted by an SSL model continually trained on the vocoded data. Further improvement was observed when using a new SSL distilled from the two SSLs before and after the continual training. The CM with the distilled SSL outperformed the previous best model on challenging unseen test sets, including the ASVspoof 2019 logical access, WaveFake, and In-the-Wild.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/15/2021

Investigating self-supervised front ends for speech spoofing countermeasures

Self-supervised speech model is a rapid progressing research topic, and ...
research
10/19/2022

Spoofed training data for speech spoofing countermeasure can be efficiently created using neural vocoders

A good training set for speech spoofing countermeasures requires diverse...
research
05/21/2023

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

We present a large-scale in-the-wild Japanese laughter corpus and a laug...
research
06/15/2021

Spoofing Generalization: When Can't You Trust Proprietary Models?

In this work, we study the computational complexity of determining wheth...
research
02/24/2022

Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation

The performance of spoofing countermeasure systems depends fundamentally...
research
05/24/2023

Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model

Large-scale pretrained models using self-supervised learning have report...
research
09/18/2023

Spoofing attack augmentation: can differently-trained attack models improve generalisation?

A reliable deepfake detector or spoofing countermeasure (CM) should be r...

Please sign up or login with your details

Forgot password? Click here to reset