MIPS: Instance Placement for Stream Processing Systems based on Monte Carlo Tree Search

08/01/2020
by Xi Huang, et al.

Stream processing engines enable modern systems to conduct large-scale analytics over unbounded data streams in real time. They often view an application as a directed acyclic graph with streams flowing through pipelined instances of various processing units. One key challenge that emerges is instance placement, i.e., deciding how to place instances across servers so as to minimize cross-server traffic and maximize resource utilization. The challenge stems not only from its intrinsic complexity but also from the impact that successive application deployments have on one another. Recent engines such as Apache Heron adopt a more modularized scheduler design that decomposes the task into two stages: one decides the instance-to-container mapping, while the other handles the container-to-server mapping, which is delegated to standalone resource managers. The unaligned objectives and scheduler designs in the two stages may lead to long response times or low utilization. So far, however, little work has addressed this challenge. Inspired by the recent success of Monte Carlo Tree Search (MCTS) methods in various fields, we develop a novel model to characterize such systems, formulate the problem, and cast each stage of the mapping as a sequential decision process. By adopting MCTS methods, we propose MIPS, an MCTS-based Instance Placement Scheme that decides the two-staged mapping in a timely yet efficient manner. In addition, we discuss practical issues and refine MIPS to further improve its performance. Results from extensive simulations show that, given a moderate number of samples, MIPS outperforms existing schemes with a significant traffic reduction and utilization improvement. To the best of our knowledge, this paper is the first to study the two-staged mapping problem and to apply MCTS to it.
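
To make the idea of casting placement as a sequential decision process concrete, the following is a minimal sketch of generic UCT-style MCTS applied to the first stage only (instance-to-container mapping). It is not the MIPS algorithm from the paper. The function names, the uniform per-container capacity, the traffic-only reward, and the toy topology are all assumptions made for illustration; resource utilization and the container-to-server stage are not modeled.

```python
import math
import random

def cross_traffic(edges, placement):
    """Sum of traffic on edges whose endpoints sit in different containers."""
    return sum(w for (u, v), w in edges.items() if placement[u] != placement[v])

def legal_moves(placement, n_containers, capacity):
    """Containers that still have room for the next instance (assumed uniform capacity)."""
    return [c for c in range(n_containers) if placement.count(c) < capacity]

class Node:
    def __init__(self, placement, parent=None):
        self.placement = placement   # tuple: container chosen for instances 0..k-1
        self.parent = parent
        self.children = {}           # container id -> child Node
        self.visits = 0
        self.total_reward = 0.0

def mcts_place(n_instances, edges, n_containers, capacity,
               iterations=500, c_uct=1.4, seed=0):
    """Place instances one at a time; requires n_containers * capacity >= n_instances."""
    rng = random.Random(seed)
    total = sum(edges.values()) or 1.0
    root = Node(())
    best_placement, best_cost = None, float("inf")
    for _ in range(iterations):
        node = root
        # Selection: follow the UCT rule until a node with an untried move is found.
        while len(node.placement) < n_instances:
            moves = legal_moves(node.placement, n_containers, capacity)
            untried = [m for m in moves if m not in node.children]
            if untried:
                # Expansion: add one new child for an untried container choice.
                move = rng.choice(untried)
                child = Node(node.placement + (move,), node)
                node.children[move] = child
                node = child
                break
            log_n = math.log(node.visits)
            node = max(node.children.values(),
                       key=lambda ch: ch.total_reward / ch.visits
                       + c_uct * math.sqrt(log_n / ch.visits))
        # Simulation: complete the partial placement with random legal choices.
        placement = list(node.placement)
        while len(placement) < n_instances:
            placement.append(rng.choice(legal_moves(placement, n_containers, capacity)))
        cost = cross_traffic(edges, placement)
        if cost < best_cost:
            best_placement, best_cost = tuple(placement), cost
        # Backpropagation: reward is the fraction of traffic kept inside containers.
        reward = 1.0 - cost / total
        while node is not None:
            node.visits += 1
            node.total_reward += reward
            node = node.parent
    return best_placement, best_cost

# Toy pipeline 0 -> 1 -> 2 -> 3 with heavy traffic on the outer edges (made-up numbers).
edges = {(0, 1): 10, (1, 2): 1, (2, 3): 10}
placement, cost = mcts_place(n_instances=4, edges=edges, n_containers=2, capacity=2)
print(placement, cost)
```

On this toy pipeline the search co-locates the two heavily communicating pairs, so that only the light edge between instances 1 and 2 crosses containers. A fuller treatment would add a utilization term to the reward and run a second search for the container-to-server stage.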

research
08/01/2020

POTUS: Predictive Online Tuple Scheduling for Data Stream Processing Systems

Most online service providers deploy their own data stream processing sy...
research
12/08/2020

Placement is not Enough: Embedding with Proactive Stream Mapping on the Heterogenous Edge

Edge computing is naturally suited to the applications generated by Inte...
research
04/08/2023

Improving Performance Insensitivity of Large-scale Multiobjective Optimization via Monte Carlo Tree Search

The large-scale multiobjective optimization problem (LSMOP) is character...
research
02/26/2023

Towards Tackling MaxSAT by Combining Nested Monte Carlo with Local Search

Recent work proposed the UCTMAXSAT algorithm to address Maximum Satisfia...
research
10/28/2018

P-MCGS: Parallel Monte Carlo Acyclic Graph Search

Recently, there have been great interests in Monte Carlo Tree Search (MC...
research
06/13/2022

Deadline-constrained Multi-resource Task Mapping and Allocation for Edge-Cloud Systems

In an edge-cloud system, mobile devices can offload their computation in...
research
11/27/2020

Net2: A Graph Attention Network Method Customized for Pre-Placement Net Length Estimation

Net length is a key proxy metric for optimizing timing and power across ...
