Characterizing Concurrency Mechanisms for NVIDIA GPUs under Deep Learning Workloads

10/01/2021
by Guin Gilman, et al.

We investigate the performance of the concurrency mechanisms available on NVIDIA's new Ampere GPU microarchitecture under deep learning training and inference workloads. In contrast to previous studies that treat the GPU as a black box, we examine scheduling at the microarchitectural level. We find that the lack of fine-grained preemption mechanisms, robust task prioritization options, and contention-aware thread block placement policies limits the effectiveness of NVIDIA's concurrency mechanisms. In summary, because deep learning workloads are largely sequential and their resource requirements and kernel runtimes fluctuate, it is difficult on current NVIDIA hardware to execute such workloads while maintaining consistently high utilization and low, predictable turnaround times.
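For context on the mechanisms the abstract refers to: CUDA streams and stream priorities are among the concurrency mechanisms NVIDIA exposes for sharing a GPU between kernels. The sketch below is illustrative only, not code from the paper; the kernel, buffer sizes, and stream names are hypothetical. It launches the same kernel on a low-priority and a high-priority stream using the standard CUDA runtime API:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Stand-in kernel: a small arithmetic loop approximating one layer's work.
__global__ void busy_kernel(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < 1000; ++k) v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

int main() {
    const int n = 1 << 20;
    float *a = nullptr, *b = nullptr;
    cudaMalloc(&a, n * sizeof(float));
    cudaMalloc(&b, n * sizeof(float));

    // Query the device's priority range; numerically lower = higher priority.
    int least, greatest;
    cudaDeviceGetStreamPriorityRange(&least, &greatest);

    // One high-priority and one low-priority stream.
    cudaStream_t hi, lo;
    cudaStreamCreateWithPriority(&hi, cudaStreamNonBlocking, greatest);
    cudaStreamCreateWithPriority(&lo, cudaStreamNonBlocking, least);

    // Kernels on different streams may overlap if SM resources allow.
    // Priority influences which stream's pending thread blocks are
    // dispatched first; it does not preempt blocks already running.
    busy_kernel<<<(n + 255) / 256, 256, 0, lo>>>(a, n);
    busy_kernel<<<(n + 255) / 256, 256, 0, hi>>>(b, n);

    cudaDeviceSynchronize();
    cudaStreamDestroy(hi);
    cudaStreamDestroy(lo);
    cudaFree(a);
    cudaFree(b);
    printf("done\n");
    return 0;
}
```

As the comments note, stream priority only biases the dispatch order of pending thread blocks; blocks already resident on an SM run to completion. This is precisely the lack of fine-grained preemption that the abstract identifies as a limitation.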


Related research:

08/08/2021
Online Evolutionary Batch Size Orchestration for Scheduling Deep Learning Workloads in GPU Clusters
Efficient GPU resource scheduling is essential to maximize resource util...

09/13/2022
Deep Learning Training on Multi-Instance GPUs
Deep learning training is an expensive process that extensively uses GPU...

05/24/2022
Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Deep learning (DL) shows its prosperity in a wide variety of fields. The...

01/01/2023
MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs
New architecture GPUs like A100 are now equipped with multi-instance GPU...

01/19/2022
Building a Performance Model for Deep Learning Recommendation Model Training on GPUs
We devise a performance model for GPU training of Deep Learning Recommen...

11/18/2018
Analyzing Machine Learning Workloads Using a Detailed GPU Simulator
Most deep neural networks deployed today are trained using GPUs via high...

01/27/2022
Prediction of GPU Failures Under Deep Learning Workloads
Graphics processing units (GPUs) are the de facto standard for processin...
