Union: A Unified HW-SW Co-Design Ecosystem in MLIR for Evaluating Tensor Operations on Spatial Accelerators

09/15/2021
by   Geonhwa Jeong, et al.
0

To meet the extreme compute demands for deep learning across commercial and scientific applications, dataflow accelerators are becoming increasingly popular. While these "domain-specific" accelerators are not fully programmable like CPUs and GPUs, they retain varying levels of flexibility with respect to data orchestration, i.e., dataflow and tiling optimizations to enhance efficiency. There are several challenges when designing new algorithms and mapping approaches to execute the algorithms for a target problem on new hardware. Previous works have addressed these challenges individually. To address this challenge as a whole, in this work, we present a HW-SW co-design ecosystem for spatial accelerators called Union within the popular MLIR compiler infrastructure. Our framework allows exploring different algorithms and their mappings on several accelerator cost models. Union also includes a plug-and-play library of accelerator cost models and mappers which can easily be extended. The algorithms and accelerator cost models are connected via a novel mapping abstraction that captures the map space of spatial accelerators which can be systematically pruned based on constraints from the hardware, workload, and mapper. We demonstrate the value of Union for the community with several case studies which examine offloading different tensor operations(CONV/GEMM/Tensor Contraction) on diverse accelerator architectures using different mapping schemes.

READ FULL TEXT
research
07/11/2018

VTA: An Open Hardware-Software Stack for Deep Learning

Hardware acceleration is an enabler for ubiquitous and efficient deep le...
research
05/26/2021

Compiling Halide Programs to Push-Memory Accelerators

Image processing and machine learning applications benefit tremendously ...
research
02/18/2020

Marvel: A Data-centric Compiler for DNN Operators on Spatial Accelerators

The efficiency of a spatial DNN accelerator depends heavily on the compi...
research
04/20/2020

Agile Autotuning of a Transprecision Tensor Accelerator Overlay for TVM Compiler Stack

Specialized accelerators for tensor-operations, such as blocked-matrix o...
research
05/12/2022

Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling

In recent years, many accelerators have been proposed to efficiently pro...
research
04/01/2020

User-Space Emulation Framework for Domain-Specific SoC Design

In this work, we propose a portable, Linux-based emulation framework to ...
research
06/19/2021

Evaluating Spatial Accelerator Architectures with Tiled Matrix-Matrix Multiplication

There is a growing interest in custom spatial accelerators for machine l...

Please sign up or login with your details

Forgot password? Click here to reset