Cocktail: Cost-efficient and Data Skew-aware Online In-Network Distributed Machine Learning for Intelligent 5G and Beyond

by Lingjun Pu, et al.

To facilitate emerging applications in 5G networks and beyond, mobile network operators will provide many powerful control functionalities such as RAN slicing and resource scheduling. These control functionalities generally comprise a series of prediction tasks, such as channel state information prediction, cellular traffic prediction, and user mobility prediction, which will be enabled by machine learning (ML) techniques. However, training the ML models offline is inefficient because of the excessive overhead of forwarding the huge volume of data samples from cellular networks to remote ML training clouds. Building on the promising edge computing paradigm, we advocate cooperative online in-network ML training across edge clouds. To alleviate the data skew caused by the capacity heterogeneity and dynamics of edge clouds while avoiding excessive overhead, we propose Cocktail, a cost-efficient and data skew-aware online in-network distributed machine learning framework. We build a comprehensive model and formulate an online data scheduling problem that optimizes the framework cost while reconciling the data skew from both short-term and long-term perspectives. We exploit stochastic gradient descent to devise an online, asymptotically optimal algorithm. As its core building block, we propose optimal policies based on novel graph constructions, one for each of two subproblems. We also improve the proposed online algorithm with online learning for fast convergence of in-network ML training. A small-scale testbed and large-scale simulations validate the superior performance of our framework.




