Where to Begin? Exploring the Impact of Pre-Training and Initialization in Federated Learning

by John Nguyen, et al.

An oft-cited challenge of federated learning is the presence of data heterogeneity – the data at different clients may follow very different distributions. Several federated optimization methods have been proposed to address this challenge. In the literature, empirical evaluations usually start federated training from a random initialization. However, in many practical applications of federated learning, the server has access to proxy data for the training task, which can be used to pre-train a model before starting federated training. We empirically study the impact of starting from a pre-trained model in federated learning using four common federated learning benchmark datasets. Unsurprisingly, starting from a pre-trained model reduces the training time required to reach a target error rate and enables training more accurate models (by up to 40%) than is possible when starting from a random initialization. Surprisingly, we also find that the effect of data heterogeneity is much less significant when starting federated training from a pre-trained initialization. Rather, when starting from a pre-trained model, using an adaptive optimizer at the server, such as FedAdam, consistently leads to the best accuracy. We recommend that future work proposing and evaluating federated optimization methods consider the performance when starting from both random and pre-trained initializations. We also believe this study raises several questions for further work on understanding the role of heterogeneity in federated optimization.
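To make the phrase "adaptive optimizer at the server" concrete, here is a minimal sketch of one FedAdam-style server round in plain Python. It follows the commonly described form of the update (average the client model deltas, then apply an Adam-like step to the server model) and is not taken from this paper's code. The function name `fedadam_server_step`, the flat-list weight representation, and the default hyperparameters are all assumptions made for illustration. Starting from a pre-trained versus a random initialization only changes the `weights` passed into the first round.

```python
import math


def fedadam_server_step(weights, client_deltas, m, v,
                        lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One illustrative FedAdam-style server update (hypothetical helper).

    weights:       current server model as a flat list of floats
    client_deltas: list of per-client updates (client_model - server_model),
                   each a flat list with the same length as `weights`
    m, v:          first- and second-moment state carried across rounds
    tau:           small constant controlling the degree of adaptivity
    """
    n_clients = len(client_deltas)
    new_weights, new_m, new_v = [], [], []
    for j, w in enumerate(weights):
        # Pseudo-gradient: unweighted average of the client deltas.
        delta = sum(d[j] for d in client_deltas) / n_clients
        # Adam-style moment estimates (no bias correction in this sketch).
        m_j = beta1 * m[j] + (1 - beta1) * delta
        v_j = beta2 * v[j] + (1 - beta2) * delta * delta
        # Deltas already point in the descent direction, so they are added.
        new_weights.append(w + lr * m_j / (math.sqrt(v_j) + tau))
        new_m.append(m_j)
        new_v.append(v_j)
    return new_weights, new_m, new_v


# Usage: `init` stands in for either a pre-trained or a random starting point.
init = [0.0, 1.0]
deltas = [[1.0, -1.0], [3.0, -1.0]]  # made-up client updates
state_m, state_v = [0.0, 0.0], [0.0, 0.0]
new_w, state_m, state_v = fedadam_server_step(init, deltas, state_m, state_v)
```

Averaging the deltas without weighting by client dataset size is a simplification here; a weighted average is a common alternative.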





Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat

We carefully evaluate a number of algorithms for learning in a federated...

Federated Inference with Reliable Uncertainty Quantification over Wireless Channels via Conformal Prediction

Consider a setting in which devices and a server share a pre-trained mod...

An Empirical Evaluation of Federated Contextual Bandit Algorithms

As the adoption of federated learning increases for learning from sensit...

On Pre-Training for Federated Learning

In most of the literature on federated learning (FL), neural networks ar...

Federated Learning for ASR based on Wav2vec 2.0

This paper presents a study on the use of federated learning to train an...

Federated Learning Framework Coping with Hierarchical Heterogeneity in Cooperative ITS

In this paper, we introduce a federated learning framework coping with H...
