Protecting Real-Time GPU Applications on Integrated CPU-GPU SoC Platforms

12/23/2017
by   Waqar Ali, et al.

Integrated CPU-GPU architectures provide excellent acceleration capabilities for data-parallel applications on embedded platforms while meeting size, weight, and power (SWaP) requirements. However, sharing main memory between CPU applications and GPU kernels can severely affect the execution of GPU kernels and diminish the performance gain provided by the GPU. For example, on the NVIDIA Tegra K1 platform, which has an integrated CPU-GPU architecture, we observed that in the worst case GPU kernels can suffer as much as a 4X slowdown in the presence of co-running memory-intensive CPU applications, compared to their solo execution. In this paper, we propose a software mechanism, which we call BWLOCK++, to protect the performance of GPU kernels from co-scheduled memory-intensive CPU applications.

research
12/23/2017

Protecting real-time GPU kernels on integrated CPU-GPU SoC platforms

Integrated CPU-GPU architecture provides excellent acceleration capabili...
research
01/25/2021

RTGPU: Real-Time GPU Scheduling of Hard Deadline Parallel Tasks with Fine-Grain Utilization

Many emerging cyber-physical systems, such as autonomous vehicles and ro...
research
05/09/2022

Towards a High-performance and Secure Memory System and Architecture for Emerging Applications

In this dissertation, we propose a memory and computing coordinated meth...
research
03/17/2020

Co-Optimizing Performance and Memory Footprint Via Integrated CPU/GPU Memory Management, an Implementation on Autonomous Driving Platform

Cutting-edge embedded system applications, such as self-driving cars and...
research
01/12/2023

HEP-BNN: A Framework for Finding Low-Latency Execution Configurations of BNNs on Heterogeneous Multiprocessor Platforms

Binarized Neural Networks (BNNs) significantly reduce the computation an...
research
08/01/2016

TREES: A CPU/GPU Task-Parallel Runtime with Explicit Epoch Synchronization

We have developed a task-parallel runtime system, called TREES, that is ...
research
05/27/2017

Fast MPEG-CDVS Encoder with GPU-CPU Hybrid Computing

The compact descriptors for visual search (CDVS) standard from ISO/IEC m...
