Towards Understanding and Mitigating Audio Adversarial Examples for Speaker Recognition

by Guangke Chen, et al.

Speaker recognition systems (SRSs) have recently been shown to be vulnerable to adversarial attacks, raising significant security concerns. In this work, we systematically investigate transformation-based and adversarial-training-based defenses for securing SRSs. Based on the characteristics of SRSs, we present 22 diverse transformations and thoroughly evaluate them using 7 recent promising adversarial attacks (4 white-box and 3 black-box) on speaker recognition. With careful regard for best practices in defense evaluations, we analyze the strength of the transformations in withstanding adaptive attacks. We also evaluate and analyze their effectiveness against adaptive attacks when combined with adversarial training. Our study provides many useful insights and findings, many of which are new or inconsistent with conclusions in the image and speech recognition domains. For example, variable and constant bit rate speech compressions perform differently, and some non-differentiable transformations remain effective against current promising evasion techniques that often work well in the image domain. We demonstrate that the proposed novel feature-level transformation combined with adversarial training is considerably more effective than adversarial training alone in a complete white-box setting, e.g., increasing the accuracy by 13.62% and the attack cost by two orders of magnitude, while other transformations do not necessarily improve the overall defense capability. This work sheds further light on the research directions in this field. We also release our evaluation platform SPEAKERGUARD to foster further research.
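
To make the idea of a transformation-based defense concrete, the following is a minimal sketch of two generic waveform-level input transformations (amplitude quantization and median smoothing) applied before scoring. It is an illustration under stated assumptions, not the authors' implementation and not the SPEAKERGUARD API: the function names, the parameters `q` and `k`, and the `score_fn` callable are all hypothetical, and the abstract does not say which 22 transformations were studied.

```python
import numpy as np

def quantize(waveform, q=256):
    """Round each sample to the nearest multiple of q (hypothetical parameter).

    Perturbations smaller than q/2 around a grid point are rounded away.
    """
    return np.round(np.asarray(waveform, dtype=float) / q) * q

def median_smooth(waveform, k=3):
    """Replace each sample with the median of a length-k window (k odd)."""
    waveform = np.asarray(waveform, dtype=float)
    pad = k // 2
    # Edge padding keeps the output the same length as the input.
    padded = np.pad(waveform, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, k)
    return np.median(windows, axis=-1)

def defended_score(score_fn, waveform, transform):
    """Apply the transformation to the input, then score it.

    score_fn stands in for any speaker recognition model (an assumption).
    """
    return score_fn(transform(waveform))
```

In this sketch the defense sits entirely in front of the model, so the model itself is unchanged. Because rounding has zero gradient almost everywhere, a transformation like `quantize` is an example of the non-differentiable kind mentioned in the abstract, which an adaptive attacker would have to approximate or bypass.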


WaveGuard: Understanding and Mitigating Audio Adversarial Examples

There has been a recent surge in adversarial attacks on deep learning ba...

Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems

Robust speaker recognition, including in the presence of malicious attac...

Who is Real Bob? Adversarial Attacks on Speaker Recognition Systems

Speaker recognition (SR) is widely used in our daily life as a biometric...

Defending against Adversarial Audio via Diffusion Model

Deep learning models have been widely used in commercial acoustic system...

SirenAttack: Generating Adversarial Audio for End-to-End Acoustic Systems

Despite their immense popularity, deep learning-based acoustic systems a...

Countering Adversarial Images using Input Transformations

This paper investigates strategies that defend against adversarial-examp...

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

As Large Language Models quickly become ubiquitous, it becomes critical ...
