Sparse linear algebra is crucial in many application domains, but challe...
This document presents implementations of fundamental convolutional neur...
Sparse-dense linear algebra is crucial in many domains, but challenging ...
On-chip communication infrastructure is a central component of modern sy...
Data-parallel problems demand ever-growing floating-point (FP) operation...
The slowdown of Moore's law and the power wall necessitate a shift towa...
Modern Hardware Description Languages (HDLs) such as SystemVerilog or VH...
Data-parallel applications, such as data analytics, machine learning, an...
Single-issue processor cores are very energy efficient but suffer from t...
In this paper, we present Ara, a 64-bit vector processor based on the ve...
Specialized coprocessors for Multiply-Accumulate (MAC) intensive workloa...
Most investigations into near-memory hardware accelerators for deep neur...