DeepAI AI Chat
Log In Sign Up

Differentially Private Federated Learning on Heterogeneous Data

by   Maxence Noble, et al.

Federated Learning (FL) is a paradigm for large-scale distributed learning which faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of participating users. In this work, we propose a novel FL approach (DP-SCAFFOLD) to tackle these two challenges together by incorporating Differential Privacy (DP) constraints into the popular SCAFFOLD algorithm. We focus on the challenging setting where users communicate with a ”honest-but-curious” server without any trusted intermediary, which requires to ensure privacy not only towards a third-party with access to the final model but also towards the server who observes all user communications. Using advanced results from DP theory, we establish the convergence of our algorithm for convex and non-convex objectives. Our analysis clearly highlights the privacy-utility trade-off under data heterogeneity, and demonstrates the superiority of DP-SCAFFOLD over the state-of-the-art algorithm DP-FedAvg when the number of local updates and the level of heterogeneity grow. Our numerical results confirm our analysis and show that DP-SCAFFOLD provides significant gains in practice.


page 1

page 2

page 3

page 4


OLIVE: Oblivious and Differentially Private Federated Learning on Trusted Execution Environment

Differentially private federated learning (DP-FL) has received increasin...

Fast-adapting and Privacy-preserving Federated Recommender System

In the mobile Internet era, recommender systems have become an irreplace...

Private Non-Convex Federated Learning Without a Trusted Server

We study differentially private (DP) federated learning (FL) with non-co...

Private Cross-Silo Federated Learning for Extracting Vaccine Adverse Event Mentions

Federated Learning (FL) is quickly becoming a goto distributed training ...

Differentially Private Federated Learning for Cancer Prediction

Since 2014, the NIH funded iDASH (integrating Data for Analysis, Anonymi...

Privacy Amplification by Decentralization

Analyzing data owned by several parties while achieving a good trade-off...

Training Production Language Models without Memorizing User Data

This paper presents the first consumer-scale next-word prediction (NWP) ...