Cloud-scale VM Deflation for Running Interactive Applications On Transient Servers

by Alexander Fuerst, et al.

Transient computing has become popular in public cloud environments for running delay-insensitive batch and data processing applications at low cost. Since transient cloud servers can be revoked at any time by the cloud provider, they are considered unsuitable for running interactive applications such as web services. In this paper, we present VM deflation as an alternative mechanism to server preemption for reclaiming resources from transient cloud servers under resource pressure. Using real traces from top-tier cloud providers, we show the feasibility of using VM deflation as a resource reclamation mechanism for interactive applications in public clouds. We show how current hypervisor mechanisms can be used to implement VM deflation and present cluster deflation policies for resource management of transient and on-demand cloud VMs. Experimental evaluation of our deflation system on a Linux cluster shows that microservice-based applications can be deflated by up to 50% with negligible performance overhead. Our cluster-level deflation policies allow overcommitment levels as high as 50%, with less than a 1% decrease in application throughput, and can enable cloud platforms to increase revenue by 30%.



Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers

Cloud GPU servers have become the de facto way for deep learning practit...

SpotTune: Leveraging Transient Resources for Cost-efficient Hyper-parameter Tuning in the Public Cloud

Hyper-parameter tuning (HPT) is crucial for many machine learning (ML) a...

n-m-Variant Systems: Adversarial-Resistant Software Rejuvenation for Cloud-Based Web Applications

Web servers are a popular target for adversaries as they are publicly ac...

Memtrade: A Disaggregated-Memory Marketplace for Public Clouds

We present Memtrade, the first memory disaggregation system for public c...

HyperNAT: Scaling Up Network Address Translation with SmartNICs for Clouds

Network address translation (NAT) is a basic functionality in cloud gate...

Effect of Human Learning on the Transient Performance of Cloud-based Tiered Applications

Cloud based tiered applications are increasingly becoming popular, be it...

Software Aging Analysis of Web Server Using Neural Networks

Software aging is a phenomenon that refers to progressive performance de...
