SDRM3: A Dynamic Scheduler for Dynamic Real-time Multi-model ML Workloads

by Seah Kim, et al.

Emerging real-time multi-model ML (RTMM) workloads such as AR/VR and drone control often involve dynamic behaviors at various levels: task, model, and layers (or ML operators) within a model. Such dynamic behaviors pose new challenges to the system software in an ML system because, unlike traditional ML workloads, the overall system load is unpredictable. In addition, real-time processing requires meeting deadlines, and multi-model workloads involve highly heterogeneous models. As RTMM workloads often run on resource-constrained devices (e.g., a VR headset), developing an effective scheduler is an important research problem. Therefore, we propose a new scheduler, SDRM3, that effectively handles various forms of dynamicity in RTMM-style workloads targeting multi-accelerator systems. To make scheduling decisions, SDRM3 quantifies the unique requirements of RTMM workloads and uses the resulting scores to drive scheduling decisions, considering the current system load and other inference jobs on different models and input frames. SDRM3 has tunable parameters that provide fast adaptivity to dynamic workload changes based on a gradient descent-like online optimization, which typically converges within five steps for new workloads. We also propose a method to exploit model-level dynamicity based on Supernet: it dynamically selects a proper sub-network in a Supernet based on the system load, exploiting the trade-off between scheduling effectiveness and model performance (e.g., accuracy). In our evaluation on five realistic RTMM workload scenarios, SDRM3 reduces the overall UXCost, which is an energy-delay-product (EDP)-equivalent metric for real-time applications defined in the paper, by a mean of 37.7% (up to 97.6%), which shows the efficacy of our scheduling methodology.
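To make the load-aware sub-network selection idea concrete, the following is a minimal sketch and not the method from the paper. It assumes each sub-network has a profiled latency and accuracy, that latency stretches by a factor of 1/(1 - load) under contention, and that the scheduler prefers the most accurate sub-network that still meets the deadline. The names `SubNet`, `select_subnet`, `latency_ms`, and `load` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SubNet:
    """Hypothetical description of one sub-network sampled from a Supernet."""
    name: str
    latency_ms: float  # assumed latency profiled on an otherwise idle accelerator
    accuracy: float    # assumed validation accuracy in [0, 1]


def select_subnet(candidates: Sequence[SubNet],
                  deadline_ms: float,
                  load: float) -> SubNet:
    """Pick the most accurate sub-network expected to meet the deadline.

    `load` is the assumed fraction of accelerator time already committed,
    in the range [0, 1).
    """
    if not candidates:
        raise ValueError("at least one candidate sub-network is required")
    if not 0.0 <= load < 1.0:
        raise ValueError("load must be in [0, 1)")

    # Illustrative contention model (an assumption, not taken from the paper):
    # the expected latency grows as the available share of the accelerator shrinks.
    feasible = [c for c in candidates
                if c.latency_ms / (1.0 - load) <= deadline_ms]

    if feasible:
        return max(feasible, key=lambda c: c.accuracy)
    # Nothing fits the deadline: fall back to the fastest sub-network.
    return min(candidates, key=lambda c: c.latency_ms)


candidates = [
    SubNet("small", latency_ms=5.0, accuracy=0.70),
    SubNet("medium", latency_ms=10.0, accuracy=0.76),
    SubNet("large", latency_ms=20.0, accuracy=0.80),
]

for load in (0.0, 0.5, 0.9):
    chosen = select_subnet(candidates, deadline_ms=33.0, load=load)
    print(f"load={load:.1f} -> {chosen.name}")
```

Under these assumptions, a lightly loaded system keeps the largest sub-network, while heavier load pushes the choice toward smaller ones, trading accuracy for a better chance of meeting the deadline.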




