High-Performance and Energy-Efficient Memory Scheduler Design for Heterogeneous Systems

by Rachata Ausavarungnirun, et al.

When multiple processor cores (CPUs) and a GPU are integrated on the same chip and share the off-chip DRAM, requests from the GPU can heavily interfere with requests from the CPUs, leading to low system performance and starvation of cores. Unfortunately, state-of-the-art memory scheduling algorithms are ineffective at solving this problem because of the very large amount of GPU memory traffic, unless a very large and costly request buffer is employed to give these algorithms enough visibility across the global request stream. Previously proposed memory controller (MC) designs use a single monolithic structure to perform three main tasks. First, the MC attempts to schedule together requests to the same DRAM row to increase row buffer hit rates. Second, the MC arbitrates among the requesters (CPUs and GPU) to optimize for overall system throughput, average response time, fairness, and quality of service. Third, the MC manages the low-level DRAM command scheduling to complete requests while ensuring compliance with all DRAM timing and power constraints. This paper proposes a fundamentally new approach, called the Staged Memory Scheduler (SMS), which decouples the three primary MC tasks into three significantly simpler structures that together improve system performance and fairness. Our evaluation shows that SMS provides a 41.2% performance improvement and a fairness improvement compared to the best previous state-of-the-art technique, while enabling a design that is significantly less complex and more power-efficient to implement.
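The three tasks named in the abstract can be pictured as three small, separate stages. The following is only a toy sketch under stated assumptions, not the SMS design itself: the abstract does not specify the policies used in each stage, so the round-robin arbitration, the `Request` fields, and the function names `form_batches`, `schedule_batches`, and `issue_commands` are all hypothetical, and DRAM timing and power constraints are ignored entirely.

```python
from collections import deque, namedtuple

# Hypothetical request record: which requester issued it and which
# DRAM bank/row it targets.
Request = namedtuple("Request", ["source", "bank", "row"])


def form_batches(requests):
    """Task 1: per source, group consecutive requests to the same DRAM row."""
    batches = {}  # source -> deque of batches (each batch is a list)
    for req in requests:
        queue = batches.setdefault(req.source, deque())
        if queue and (queue[-1][-1].bank, queue[-1][-1].row) == (req.bank, req.row):
            queue[-1].append(req)  # same row as the newest batch: extend it
        else:
            queue.append([req])    # different row: start a new batch
    return batches


def schedule_batches(batches):
    """Task 2: arbitrate among requesters.

    Assumption: plain round-robin, one batch per source per turn.
    """
    order = []
    sources = sorted(batches)
    while any(batches[s] for s in sources):
        for s in sources:
            if batches[s]:
                order.append(batches[s].popleft())
    return order


def issue_commands(ordered_batches):
    """Task 3: translate requests into DRAM commands.

    Tracks only the open row per bank; real timing constraints are omitted.
    """
    open_row = {}
    commands = []
    for batch in ordered_batches:
        for req in batch:
            if open_row.get(req.bank) != req.row:
                if req.bank in open_row:
                    commands.append(("PRE", req.bank))       # close old row
                commands.append(("ACT", req.bank, req.row))  # open new row
                open_row[req.bank] = req.row
            commands.append(("RD", req.bank, req.row))
    return commands


if __name__ == "__main__":
    stream = [
        Request("cpu0", 0, 1),
        Request("gpu", 0, 7),
        Request("gpu", 0, 7),
        Request("cpu0", 0, 1),
        Request("gpu", 0, 7),
    ]
    for cmd in issue_commands(schedule_batches(form_batches(stream))):
        print(cmd)
```

Because the first stage keeps same-row requests together, the interleaved CPU and GPU stream above needs only one row activation per batch rather than one per switch between requesters. Each stage also sees only its own small piece of state, which is the sense in which a decoupled design can be simpler than a single monolithic scheduler.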


Exploiting the DRAM Microarchitecture to Increase Memory-Level Parallelism

This paper summarizes the idea of Subarray-Level Parallelism (SALP) in D...

HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips

DRAM is the building block of modern main memory systems. DRAM cells mus...

Ohm-GPU: Integrating New Optical Network and Heterogeneous Memory into GPU Multi-Processors

Traditional graphics processing units (GPUs) suffer from the low memory ...

CADS: Core-Aware Dynamic Scheduler for Multicore Memory Controllers

Memory controller scheduling is crucial in multicore processors, where D...

Mithril: Cooperative Row Hammer Protection on Commodity DRAM Leveraging Managed Refresh

Since its public introduction in the mid-2010s, the Row Hammer (RH) phen...

Evaluating Row Buffer Locality in Future Non-Volatile Main Memories

DRAM-based main memories have read operations that destroy the read data...

Criticality Aware Multiprocessors

Typically, a memory request from a processor may need to go through many...
