Transport Triggered Array Processor for Vision Applications

06/10/2019
by Mehdi Safarpour, et al.

Low-level sensory data processing in many Internet-of-Things (IoT) devices pursues energy efficiency by using sleep modes or slowing the clock to the minimum. To curb the share of stand-by power dissipation in those designs, near-threshold or sub-threshold operating points or ultra-low-leakage fabrication processes are employed. These measures limit clock rates significantly, reducing the computing throughput of individual processing cores. In this contribution we explore compensating for the performance loss of operating in the near-threshold region (Vdd = 0.6 V) through massive parallelization. The benefits of near-threshold operation and massive parallelism are, respectively, optimum energy consumption per instruction and minimized memory round trips. The Processing Elements (PEs) of the design are based on the Transport Triggered Architecture. The fine-grained, programmable, parallel solution allows for fast and efficient computation of learnable low-level features (e.g. local binary descriptors and convolutions). Other operations, including max-pooling, have also been implemented. The programmable design achieves excellent energy efficiency for Local Binary Patterns computations.
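
For readers unfamiliar with the Local Binary Patterns workload named in the abstract, the basic 3x3 operator can be sketched in a few lines of Python. This is only an illustrative software reference, not the paper's Transport Triggered Architecture implementation; the function name `lbp_3x3`, the clockwise bit ordering, and the `>=` comparison are assumptions made for this sketch.

```python
def lbp_3x3(image):
    """Return 8-bit LBP codes for the interior pixels of a 2-D list of ints.

    Hypothetical helper for illustration; border pixels are skipped.
    """
    # Neighbor offsets (row, col), clockwise from top-left.
    # The bit ordering is an assumption; other conventions exist.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


patch = [[1, 2, 3],
         [8, 5, 4],
         [7, 6, 5]]
print(lbp_3x3(patch))  # [[240]]
```

Each output code depends only on a pixel's 3x3 neighborhood, so every pixel can be processed independently, which is the property that makes this kind of operator a natural fit for an array of parallel processing elements.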


