HWTool: Fully Automatic Mapping of an Extensible C++ Image Processing Language to Hardware

10/23/2021
by James Hegarty, et al.

Implementing image processing algorithms using FPGAs or ASICs can improve energy efficiency by orders of magnitude over optimized CPU, DSP, or GPU code. These efficiency improvements are crucial for enabling new applications on power-constrained mobile devices, such as cell phones or AR/VR headsets. Unfortunately, custom hardware is commonly implemented using a waterfall process with time-intensive manual mapping and optimization phases. Thus, it can take years for a new algorithm to make it all the way from algorithm design to shipping silicon. Recent improvements in hardware design tools, such as C-to-gates High-Level Synthesis (HLS), can reduce design time, but still require manual tuning from hardware experts. In this paper, we present HWTool, a novel system for automatically mapping image processing and computer vision algorithms to hardware. Our system maps between two domains: HWImg, an extensible C++ image processing library containing common image processing and parallel computing operators, and Rigel2, a library of optimized hardware implementations of HWImg's operators and a backend Verilog compiler. We show how to automatically compile HWImg to Rigel2 by solving for interfaces, hardware sizing, and FIFO buffer allocation. Finally, we map full-scale image processing applications like convolution, optical flow, depth from stereo, and feature descriptors to FPGA using our system. On these examples, HWTool requires on average only 11% more area than hand-optimized designs (with manual FIFO allocation), and 33% more area than hand-optimized designs with automatic FIFO allocation, and performs similarly to HLS.


