A High-performance, Energy-efficient Modular DMA Engine Architecture

05/09/2023
by Thomas Benz, et al.

Data transfers are essential in today's computing systems as latency and complex memory access patterns are increasingly challenging to manage. Direct memory access engines (DMAEs) are critically needed to transfer data independently of the processing elements, hiding latency and achieving high throughput even for complex access patterns to high-latency memory. With the prevalence of heterogeneous systems, DMAEs must operate efficiently in increasingly diverse environments. This work proposes a modular and highly configurable open-source DMAE architecture called intelligent DMA (iDMA), split into three parts that can be composed and customized independently. The front-end implements the control plane binding to the surrounding system. The mid-end accelerates complex data transfer patterns such as multi-dimensional transfers, scattering, or gathering. The back-end interfaces with the on-chip communication fabric (data plane). We assess the efficiency of iDMA in various instantiations: in high-performance systems, we achieve speedups of up to 15.8x with only 1 [...]; we achieve an area reduction of 10 [...] 23 [...]. We provide area, timing, latency, and performance characterization to guide its instantiation in various systems.
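To make the three-part split concrete, the following is a minimal software sketch, not iDMA's actual interface or hardware: a hypothetical `run_transfer` stands in for the front-end, `expand_nd` for a mid-end that turns a multi-dimensional request into one-dimensional transfers, and `copy_1d` for a back-end that moves bytes. All names, the `Dim` and `Transfer1D` types, and the flat `bytearray` memory model are assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class Transfer1D:
    """One contiguous copy, the only request the sketched back-end accepts."""
    src: int
    dst: int
    length: int


@dataclass(frozen=True)
class Dim:
    """One outer dimension: repetition count plus source/destination strides in bytes."""
    reps: int
    src_stride: int
    dst_stride: int


def expand_nd(src: int, dst: int, length: int,
              dims: Sequence[Dim]) -> Iterator[Transfer1D]:
    """Hypothetical mid-end: expand an N-D request into 1-D transfers.

    `dims` is listed outermost first; with no dims, exactly one transfer results.
    """
    for idx in product(*(range(d.reps) for d in dims)):
        src_off = sum(i * d.src_stride for i, d in zip(idx, dims))
        dst_off = sum(i * d.dst_stride for i, d in zip(idx, dims))
        yield Transfer1D(src + src_off, dst + dst_off, length)


def copy_1d(mem: bytearray, t: Transfer1D) -> None:
    """Hypothetical back-end: perform one contiguous copy in a flat memory."""
    mem[t.dst:t.dst + t.length] = mem[t.src:t.src + t.length]


def run_transfer(mem: bytearray, src: int, dst: int, length: int,
                 dims: Sequence[Dim] = ()) -> int:
    """Hypothetical front-end: accept a request, return the number of 1-D transfers issued."""
    count = 0
    for t in expand_nd(src, dst, length, dims):
        copy_1d(mem, t)
        count += 1
    return count


# Usage: gather a 3x4-byte tile out of an 8-byte-wide row-major buffer
# into a contiguous region starting at offset 64.
mem = bytearray(128)
mem[:64] = bytes(range(64))
issued = run_transfer(mem, src=10, dst=64, length=4, dims=[Dim(3, 8, 4)])
print(issued, list(mem[64:76]))
```

Keeping the expansion separate from the copy mirrors the idea that the three parts can be composed independently: in this sketch, a scatter or gather pattern would only need a different generator in place of `expand_nd`, while `copy_1d` stays unchanged.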


