An Adaptive Performance-oriented Scheduler for Static and Dynamic Heterogeneity

05/02/2019
by   Jing Chen, et al.

With heterogeneous hardware paving the way for the post-Moore era, it is important to adapt runtime scheduling to the platform's heterogeneity. To enable adaptive and responsive scheduling, we introduce a Performance Trace Table (PTT) into XiTAO, a framework for elastic scheduling of mixed-mode parallelism. The PTT is an extensible, dynamic, and lightweight manifest of per-core latency that can be used to guide the scheduling of both critical and non-critical tasks. By tracking per-task latency, the PTT can infer task performance as well as intra-application and inter-application interference. We run random Directed Acyclic Graphs (DAGs) of different workload categories as a benchmark on an NVIDIA Jetson TX2 chip, achieving up to 3.25x speedup over a standard work-stealing scheduler. To exemplify how scheduling adapts to interference, we run DAGs with high parallelism and analyze the scheduler's response to interference from a background process on an Intel Haswell (2650v3) multicore workstation. We also showcase XiTAO's scheduling performance by porting the VGG-16 image classification framework, which is based on Convolutional Neural Networks (CNNs).
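
To make the idea of a latency table that guides scheduling concrete, here is a minimal Python sketch. It is not XiTAO's implementation: the class name `PerformanceTraceTable`, the weighted-average update, the `weight` parameter, and the "pick the lowest-latency core" rule are all illustrative assumptions, since the abstract does not specify how the PTT is updated or queried.

```python
from collections import defaultdict


class PerformanceTraceTable:
    """Hypothetical PTT: one latency estimate per (task type, core)."""

    def __init__(self, num_cores, weight=0.2):
        self.num_cores = num_cores
        self.weight = weight  # assumed smoothing factor for new samples
        # Entries start at 0.0, so untried cores look attractive and get sampled.
        self.table = defaultdict(lambda: [0.0] * num_cores)

    def update(self, task_type, core, latency):
        """Blend a newly measured latency into the stored estimate."""
        row = self.table[task_type]
        if row[core] == 0.0:
            row[core] = latency  # first observation is taken as-is
        else:
            row[core] = (1 - self.weight) * row[core] + self.weight * latency

    def best_core(self, task_type):
        """Return the core with the lowest estimated latency (assumed policy)."""
        row = self.table[task_type]
        return min(range(self.num_cores), key=row.__getitem__)


# Made-up latencies for a task type "conv" on a 4-core platform.
ptt = PerformanceTraceTable(num_cores=4)
for core, latency in enumerate([8.0, 9.0, 3.0, 3.5]):
    ptt.update("conv", core, latency)
fast = ptt.best_core("conv")

# A slow sample on that core (e.g. a background process) raises its estimate,
# so the next placement decision can move elsewhere.
ptt.update("conv", fast, 50.0)
after = ptt.best_core("conv")
```

In this sketch a single slow measurement is enough to shift the preferred core, which illustrates how a table of observed latencies can react to interference without any explicit model of the hardware.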
