Eventually-Consistent Federated Scheduling for Data Center Workloads

08/20/2023
by Meghana Thiyyakat, et al.

Data center schedulers operate at unprecedented scales today to accommodate the growing demand for computing and storage power. The challenge they face is sustaining the required scheduling speed at that scale. To do so, most scheduler architectures use parallelism. However, these architectures consist of multiple parallel scheduling entities that can each use only partial knowledge of the data center's state, because maintaining a consistent global state would involve considerable communication overhead. The disadvantage of scheduling without global knowledge is sub-optimal placement: tasks may be made to wait in queues even though resources are available in zones outside the scope of the scheduling entity's state. This leads to unnecessary queuing overhead and lower resource utilization across the data center. In this paper, we extend our previous work on Megha, a federated, decentralized data center scheduling architecture that uses eventual consistency. The architecture combines parallelism with an eventually-consistent global state in each of its scheduling entities to make fast decisions in a scalable manner. We compare Megha with three scheduling architectures (Sparrow, Eagle, and Pigeon) using simulation. We also evaluate Megha's prototype on a 123-node cluster and compare its performance with that of Pigeon's prototype using cluster traces. The results of our experiments show that Megha consistently reduces delays in job completion time compared to the other architectures.
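
The idea of scheduling against an eventually-consistent global state can be illustrated with a minimal Python sketch. It is not taken from the paper and does not reproduce Megha's actual protocol: the names `ZoneOwner`, `Scheduler`, `try_place`, and `refresh` are hypothetical, and the sketch assumes that each zone has an authoritative owner that validates placements, and that a scheduling entity repairs its stale copy only when a placement is rejected.

```python
class ZoneOwner:
    """Authoritative holder of one zone's free slots (hypothetical name)."""

    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots

    def try_place(self):
        # Every request is validated against the true state of the zone.
        if self.free_slots > 0:
            self.free_slots -= 1
            return True
        return False


class Scheduler:
    """Scheduling entity with a possibly stale copy of every zone's state."""

    def __init__(self, zones):
        self.zones = zones  # assumption: dict mapping zone name -> ZoneOwner
        # Local view of the global state; it may drift from the truth
        # as other schedulers place tasks in parallel.
        self.view = {name: z.free_slots for name, z in zones.items()}

    def refresh(self, zone_name):
        # Repair the stale entry with the owner's current value.
        self.view[zone_name] = self.zones[zone_name].free_slots

    def schedule(self):
        """Return the zone where a task was placed, or None if it must queue."""
        # Try zones in order of believed availability, most free slots first.
        for zone_name in sorted(self.view, key=self.view.get, reverse=True):
            if self.view[zone_name] <= 0:
                break  # no remaining zone is believed to have capacity
            if self.zones[zone_name].try_place():
                self.view[zone_name] -= 1
                return zone_name
            # The local view was stale: update it and try the next zone.
            self.refresh(zone_name)
        return None
```

In this sketch, two `Scheduler` instances sharing the same zones can both decide without coordinating in advance. A decision based on stale state costs one rejected request, after which the scheduler's view converges toward the true state. The benefit over a partial-knowledge design is that a task is queued only when every zone is believed to be full, not merely the zones one scheduler happens to track.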
