Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings

by Tiantian Feng, et al.

Speech emotion recognition (SER) processes speech signals to detect and characterize expressed perceived emotions. Many SER applications acquire speech data at the client side and transmit it to remote cloud platforms for inference and decision making. However, speech data carry rich information not only about the emotions conveyed in vocal expressions, but also about sensitive demographic traits such as gender, age, and language background. Consequently, it is desirable for SER systems to classify emotion constructs while preventing unintended or improper inference of sensitive and demographic information. Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing their local data. This training approach appears secure and can improve privacy for SER. However, recent works have demonstrated that FL approaches are still vulnerable to various privacy attacks, such as reconstruction attacks and membership inference attacks. Although most of these works have focused on computer vision applications, such information leakage also exists in SER systems trained using FL. To assess the information leakage of SER systems trained using FL, we propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or shared model parameters, corresponding to the FedSGD and FedAvg training algorithms, respectively. As a use case, we empirically evaluate our approach for predicting the client's gender information using three SER benchmark datasets: IEMOCAP, CREMA-D, and MSP-Improv. We show that the attribute inference attack is achievable for SER systems trained using FL. We further identify that most of the information leakage possibly comes from the first layer in the SER model.
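To make the two leakage channels concrete, the following is a minimal sketch of what a server could observe from one client under FedSGD (a gradient) and under FedAvg (updated parameters). It is not the paper's implementation: the linear softmax model, the random data, and the names `fedsgd_share`, `fedavg_share`, and `attack_features` are all assumptions made for illustration, and the attack classifier that would consume these features is omitted.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gradient(W, X, y):
    # Mean cross-entropy gradient of a linear softmax classifier w.r.t. W.
    probs = softmax(X @ W)
    probs[np.arange(len(y)), y] -= 1.0
    return X.T @ probs / len(y)

def fedsgd_share(W_global, X, y):
    # FedSGD: the client sends a single gradient computed on its local data.
    return gradient(W_global, X, y)

def fedavg_share(W_global, X, y, lr=0.1, local_steps=5):
    # FedAvg: the client runs several local steps and sends updated parameters.
    W = W_global.copy()
    for _ in range(local_steps):
        W -= lr * gradient(W, X, y)
    return W

def attack_features(shared, W_global=None, lr=None):
    # Hypothetical helper: flatten what the server sees into a feature vector.
    # For FedAvg, the parameter change is rescaled into a pseudo-gradient.
    if W_global is not None:
        shared = (W_global - shared) / lr
    return shared.ravel()

# Toy client: 8 utterances, 4 acoustic features, 3 emotion classes (all assumed).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
y = rng.integers(0, 3, size=8)
W0 = np.zeros((4, 3))

feat_sgd = attack_features(fedsgd_share(W0, X, y))
W1 = fedavg_share(W0, X, y, lr=0.1, local_steps=1)
feat_avg = attack_features(W1, W_global=W0, lr=0.1)
```

With a single local step, the FedAvg pseudo-gradient coincides with the FedSGD gradient, which is why one attack pipeline can in principle be applied to both settings; with more local steps the two diverge.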




