Efficient Device Scheduling with Multi-Job Federated Learning

by Chendi Zhou, et al.

Recent years have witnessed a large amount of decentralized data in multiple (edge) devices of end-users, while the aggregation of the decentralized data remains difficult for machine learning jobs due to laws or regulations. Federated Learning (FL) emerges as an effective approach to handling decentralized data without sharing the sensitive raw data, while collaboratively training global machine learning models. The servers in FL need to select (and schedule) devices during the training process. However, the scheduling of devices for multiple jobs with FL remains a critical and open problem. In this paper, we propose a novel multi-job FL framework to enable the parallel training process of multiple jobs. The framework consists of a system model and two scheduling methods. In the system model, we propose a parallel training process of multiple jobs, and construct a cost model based on the training time and the data fairness of various devices during the training process of diverse jobs. We propose a reinforcement learning-based method and a Bayesian optimization-based method to schedule devices for multiple jobs while minimizing the cost. We conduct extensive experimentation with multiple jobs and datasets. The experimental results show that our proposed approaches significantly outperform baseline approaches in terms of training time (up to 8.67 times faster) and accuracy (up to 44.6
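The abstract describes a cost model that combines training time with data fairness, and schedulers that pick devices for each job so as to minimize that cost. The sketch below is only an illustration of that idea under stated assumptions: the function names, the weights `alpha` and `beta`, the use of the slowest device as the round time, and the variance of selection counts as the fairness term are all hypothetical and are not taken from the paper. The search is a plain random search standing in for the paper's reinforcement learning-based and Bayesian optimization-based methods, which are not reproduced here.

```python
import random
from statistics import pvariance

def round_time(selected, device_time):
    # Assumption: a synchronous round ends when the slowest selected device finishes.
    return max(device_time[d] for d in selected)

def fairness(selected, counts):
    # Assumption: fairness is the variance of per-device selection counts
    # after this round; lower means devices are used more evenly.
    updated = [counts[d] + (1 if d in selected else 0) for d in range(len(counts))]
    return pvariance(updated)

def cost(selected, device_time, counts, alpha=1.0, beta=1.0):
    # Hypothetical weighted sum of the two terms named in the abstract.
    return alpha * round_time(selected, device_time) + beta * fairness(selected, counts)

def random_search_schedule(available, k, device_time, counts, trials=200, seed=0):
    # Stand-in scheduler: sample candidate device sets and keep the cheapest one.
    rng = random.Random(seed)
    best, best_cost = None, float("inf")
    for _ in range(trials):
        candidate = frozenset(rng.sample(sorted(available), k))
        c = cost(candidate, device_time, counts)
        if c < best_cost:
            best, best_cost = candidate, c
    return best, best_cost

# Two jobs trained in parallel: devices taken by job A are unavailable to job B.
device_time = [1.0, 2.0, 5.0, 1.5, 3.0, 0.8]   # hypothetical per-device round times
counts = [0] * len(device_time)                # how often each device was selected so far
available = set(range(len(device_time)))

job_a, cost_a = random_search_schedule(available, 2, device_time, counts, seed=1)
job_b, cost_b = random_search_schedule(available - job_a, 2, device_time, counts, seed=2)
```

Scheduling the jobs one after another from a shrinking pool is the simplest way to keep their device sets disjoint; a joint search over all jobs would be closer to minimizing a total cost but is omitted to keep the sketch short.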



