We introduce a software-hardware co-design approach to reduce memory tra...
Increasingly larger and better Transformer models keep advancing state-of-the-art...
Data accesses between on- and off-chip memories account for a large frac...
We present FPRaker, a processing element for composing training accelera...
TensorDash is a hardware-level technique for enabling data-parallel MAC ...
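The TensorDash entry is cut off mid-sentence, but the term it ends on, a data-parallel MAC unit, has a conventional meaning: a datapath that performs several multiply-accumulates per cycle into a shared accumulator. A minimal functional model follows, assuming a 4-lane unit; the names and lane count are mine for illustration, not TensorDash's design:

```python
def mac_cycle(acc, a_lane, w_lane):
    """One cycle of an N-lane data-parallel MAC unit:
    N products reduced into a single running accumulator."""
    return acc + sum(a * w for a, w in zip(a_lane, w_lane))

a = [1, 2, 3, 4, 5, 6, 7, 8]
w = [8, 7, 6, 5, 4, 3, 2, 1]
acc, lanes = 0, 4
for i in range(0, len(a), lanes):   # one group of `lanes` operand pairs per cycle
    acc = mac_cycle(acc, a[i:i+lanes], w[i:i+lanes])
assert acc == sum(x * y for x, y in zip(a, w))
```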
Attention-based models have demonstrated remarkable success in various n...
Neural networks have demonstrably achieved state-of-the-art accuracy usi...
We reduce training time in convolutional neural networks (CNNs) with a method t...
We motivate a method for transparently identifying ineffectual computati...
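The entry above is truncated, but in this body of work "ineffectual computations" conventionally refers to operations that cannot affect the result, most prominently multiplications where one operand is zero. A toy illustration of why such work is safe to skip; this demonstrates the property only, not the paper's identification method:

```python
def dot(xs, ws):
    return sum(x * w for x, w in zip(xs, ws))

def dot_skip_ineffectual(xs, ws):
    # A product with a zero operand cannot change the sum, so skip it.
    return sum(x * w for x, w in zip(xs, ws) if x != 0 and w != 0)

xs, ws = [0, 3, 0, 0, 2, 0], [5, 1, 7, 2, 0, 9]
assert dot(xs, ws) == dot_skip_ineffectual(xs, ws)  # 1 multiply instead of 6
```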
We show that selecting a fixed precision for all activations in Convolut...
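The abstract above is truncated, but the operation it hinges on, fixing one activation precision for every layer versus choosing it per layer, is easy to make concrete. A hedged sketch in NumPy; the `quantize` helper and the per-layer bit-widths are mine and purely illustrative, not the paper's method or its measured precisions:

```python
import numpy as np

def quantize(x, bits, frac_bits=4):
    """Uniform signed fixed-point quantization: `bits` total bits,
    `frac_bits` of them fractional, clamping on overflow."""
    scale = 2.0 ** frac_bits
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return np.clip(np.round(x * scale), lo, hi) / scale

rng = np.random.default_rng(0)
acts = [rng.standard_normal(64) for _ in range(4)]  # stand-in per-layer activations

# One fixed precision for every layer ...
fixed = [quantize(a, bits=8) for a in acts]

# ... versus a per-layer profile (bit-widths chosen only for illustration).
per_layer = [quantize(a, bits=b) for a, b in zip(acts, [9, 7, 6, 8])]
```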
We show that, during inference with Convolutional Neural Networks (CNNs)...
Tartan (TRT), a hardware accelerator for inference with Deep Neural Netw...
Loom (LM), a hardware inference accelerator for Convolutional Neural Net...
Stripes is a Deep Neural Network (DNN) accelerator that uses bit-serial computation ...
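The Stripes entry is cut off, but bit-serial computation itself is a standard idea that fits in a few lines: decompose each activation over its bits, so a multiply-accumulate takes one pass (one cycle) per bit of activation precision rather than a fixed-width pass. A minimal sketch, assuming unsigned fixed-point activations; the function name is mine, and this models the arithmetic only, not the Stripes datapath:

```python
def bit_serial_mac(activations, weights, precision):
    """Multiply-accumulate with activations streamed one bit per cycle.

    Cycle count scales with `precision` (the activation bit-width)
    instead of a fixed datapath width.
    """
    acc = 0
    for bit in range(precision):        # one "cycle" per bit position
        for a, w in zip(activations, weights):
            if (a >> bit) & 1:          # bit set: add the weight, shifted
                acc += w << bit
    return acc

acts, wts = [3, 5, 2], [7, 1, 4]
assert bit_serial_mac(acts, wts, precision=3) == sum(a * w for a, w in zip(acts, wts))
```

The payoff is that cycles track the precision a layer actually needs: activations that fit in 5 bits finish in 5 passes.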
This work studies the behavior of state-of-the-art memory controller des...
This work investigates how using reduced precision data in Convolutional...