Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

by John Nguyen, et al.

An oft-cited challenge of federated learning is the presence of heterogeneity. Data heterogeneity refers to the fact that data from different clients may follow very different distributions. System heterogeneity refers to the fact that client devices have different system capabilities. A considerable number of federated optimization methods address this challenge. In the literature, empirical evaluations usually start federated training from random initialization. However, in many practical applications of federated learning, the server has access to proxy data for the training task that can be used to pre-train a model before starting federated training. We empirically study the impact of starting from a pre-trained model in federated learning using four standard federated learning benchmark datasets. Unsurprisingly, starting from a pre-trained model reduces the training time required to reach a target error rate and enables the training of more accurate models (up to 40%) than is possible when starting from random initialization. Surprisingly, we also find that starting federated learning from a pre-trained initialization reduces the effect of both data and system heterogeneity. We recommend that future work proposing and evaluating federated optimization methods evaluate the performance when starting from random and pre-trained initializations. We also believe this study raises several questions for further work on understanding the role of heterogeneity in federated optimization.
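To make the comparison concrete, here is a minimal sketch of federated averaging (FedAvg) run from two starting points: a random initialization and one fitted on server-side proxy data. It is not the authors' experimental setup. The synthetic linear-regression task, the per-client feature shift used to mimic data heterogeneity, the least-squares "pre-training" step, and all names (`make_clients`, `local_sgd`, `fedavg`, `run_comparison`) and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def make_clients(num_clients, n_per_client, dim, rng, shift_scale=0.5):
    """Build synthetic regression clients (hypothetical setup, not from the paper).

    Each client's features are shifted by a client-specific offset,
    which is a simple stand-in for data heterogeneity.
    """
    w_true = rng.normal(size=dim)
    clients = []
    for _ in range(num_clients):
        shift = rng.normal(scale=shift_scale, size=dim)
        X = rng.normal(size=(n_per_client, dim)) + shift
        y = X @ w_true + 0.01 * rng.normal(size=n_per_client)
        clients.append((X, y))
    return w_true, clients


def local_sgd(w, X, y, steps, lr):
    """Run full-batch gradient steps on one client's least-squares loss."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


def fedavg(w_init, clients, rounds, local_steps, lr):
    """Plain FedAvg: local training on every client, then a size-weighted average."""
    w = w_init.copy()
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    for _ in range(rounds):
        local_models = [local_sgd(w, X, y, local_steps, lr) for X, y in clients]
        w = np.average(local_models, axis=0, weights=sizes)
    return w


def global_loss(w, clients):
    """Mean squared error over the union of all client data."""
    X = np.vstack([X for X, _ in clients])
    y = np.concatenate([y for _, y in clients])
    return float(np.mean((X @ w - y) ** 2))


def run_comparison(seed=0, dim=5, rounds=5):
    """Run FedAvg from a random and a proxy-data initialization and report both."""
    rng = np.random.default_rng(seed)
    w_true, clients = make_clients(10, 50, dim, rng)

    # Assumption: the server holds noisier proxy data for the same task and
    # "pre-trains" by solving least squares on it.
    X_proxy = rng.normal(size=(200, dim))
    y_proxy = X_proxy @ w_true + 0.1 * rng.normal(size=200)
    w_pre = np.linalg.lstsq(X_proxy, y_proxy, rcond=None)[0]
    w_rand = rng.normal(size=dim)

    results = {}
    for name, w0 in (("random", w_rand), ("pretrained", w_pre)):
        w = fedavg(w0, clients, rounds=rounds, local_steps=5, lr=0.05)
        results[name] = {
            "init_dist": float(np.linalg.norm(w0 - w_true)),
            "final_dist": float(np.linalg.norm(w - w_true)),
            "final_loss": global_loss(w, clients),
        }
    return results
```

Calling `run_comparison()` returns the distance to the ground-truth weights and the final loss for each initialization after the same small number of rounds. In this toy setting the gap between the two reflects only how close each starting point is to the solution; it says nothing about the neural-network benchmarks studied in the paper.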




Where to Begin? Exploring the Impact of Pre-Training and Initialization in Federated Learning

An oft-cited challenge of federated learning is the presence of data het...

Adaptive Federated Optimization

Federated learning is a distributed machine learning paradigm in which a...

An Empirical Evaluation of Federated Contextual Bandit Algorithms

As the adoption of federated learning increases for learning from sensit...

Federated Learning Over Images: Vertical Decompositions and Pre-Trained Backbones Are Difficult to Beat

We carefully evaluate a number of algorithms for learning in a federated...

Federated Learning for ASR based on Wav2vec 2.0

This paper presents a study on the use of federated learning to train an...

On Noisy Evaluation in Federated Hyperparameter Tuning

Hyperparameter tuning is critical to the success of federated learning a...

Federated Inference with Reliable Uncertainty Quantification over Wireless Channels via Conformal Prediction

Consider a setting in which devices and a server share a pre-trained mod...
