High level programming abstractions for leveraging hierarchical memories with micro-core architectures

10/04/2020
by Maurice Jamieson, et al.

Micro-core architectures combine many low-memory, low-power computing cores in a single package. They are attractive as accelerators, but because on-chip memory is limited and the memory hierarchy has multiple levels, the way in which programmers offload kernels needs careful consideration. In this paper we use Python as a vehicle for exploring the semantics and abstractions that higher-level programming languages need in order to support offloading computational kernels to these devices. By moving to a pass-by-reference model and leveraging memory kinds, we demonstrate that programmers can easily and efficiently take advantage of multiple levels of the memory hierarchy, even those that are not directly accessible to the micro-cores. Using a machine learning benchmark, we perform experiments on both Epiphany-III and MicroBlaze-based micro-cores, demonstrating the ability to compute with data sets of arbitrarily large size. To put our results in context, we explore the performance and power efficiency of these technologies, demonstrating that whilst these two micro-core technologies are competitive within their own embedded class of hardware, there is still some way to go before they match HPC-class GPUs.

research
02/03/2021

Compact Native Code Generation for Dynamic Languages on Micro-core Architectures

Micro-core architectures combine many simple, low memory, low power-cons...
research
10/28/2020

ePython: An implementation of Python for the many-core Epiphany coprocessor

The Epiphany is a many-core, low power, low on-chip memory architecture ...
research
11/10/2020

Benchmarking micro-core architectures for detecting disasters at the edge

Leveraging real-time data to detect disasters such as wildfires, extreme...
research
01/13/2022

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems

Several manufacturers have already started to commercialize near-bank Pr...
research
09/25/2020

Flexible Performant GEMM Kernels on GPUs

General Matrix Multiplication or GEMM kernels take centre place in high ...
research
02/17/2021

Performance Optimizations of Recursive Electronic Structure Solvers targeting Multi-Core Architectures (LA-UR-20-26665)

As we rapidly approach the frontiers of ultra large computing resources,...
research
04/09/2023

Portability and Scalability of OpenMP Offloading on State-of-the-art Accelerators

Over the last decade, most of the increase in computing power has been g...
