A Meta Reinforcement Learning Approach for Predictive Autoscaling in the Cloud

by Siqiao Xue, et al.

Predictive autoscaling (autoscaling with workload forecasting) is an important mechanism that supports autonomous adjustment of computing resources in accordance with fluctuating workload demands in the cloud. In recent works, Reinforcement Learning (RL) has been introduced as a promising approach for learning resource management policies that guide scaling actions in the dynamic and uncertain cloud environment. However, RL methods face several challenges in steering predictive autoscaling, including inaccurate decision-making, inefficient sampling, and significant variability in workload patterns that may cause policies to fail at test time. To this end, we propose an end-to-end predictive meta model-based RL algorithm that aims to allocate resources optimally so as to maintain a stable CPU utilization level. The algorithm takes a specially designed deep periodic workload prediction model as input and embeds a Neural Process to guide the learning of optimal scaling actions across numerous application services in the cloud. Our algorithm not only ensures the predictability and accuracy of the scaling strategy, but also enables scaling decisions to adapt to changing workloads with high sample efficiency. Our method achieves significant performance improvements over existing algorithms and has been deployed online at Alipay, supporting the autoscaling of applications for the world-leading payment platform.




