SYMOG: learning symmetric mixture of Gaussian modes for improved fixed-point quantization

02/19/2020
by Lukas Enderich, et al.

Deep neural networks (DNNs) have been proven to outperform classical methods on several machine learning benchmarks. However, they have high computational complexity and require powerful processing units. Especially when deployed on embedded systems, model size and inference time must be significantly reduced. We propose SYMOG (symmetric mixture of Gaussian modes), which significantly decreases the complexity of DNNs through low-bit fixed-point quantization. SYMOG is a novel soft quantization method in which the learning task and the quantization are solved simultaneously. During training, the weight distribution changes from a unimodal Gaussian distribution to a symmetric mixture of Gaussians, where each mean value belongs to a particular fixed-point mode. We evaluate our approach with different architectures (LeNet5, VGG7, VGG11, DenseNet) on common benchmark data sets (MNIST, CIFAR-10, CIFAR-100) and compare it with state-of-the-art quantization approaches. We achieve excellent results and outperform 2-bit state-of-the-art performance with an error rate of only 5.71%.
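The abstract describes a soft quantization objective: during training, a penalty pulls the weights toward a symmetric grid of fixed-point modes, modelled as a mixture of Gaussians. Below is a minimal PyTorch sketch of that idea. The function names, the grid construction, and the joint loss with an annealed `sigma` and weighting `lambda_q` are illustrative assumptions for clarity, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

def fixed_point_modes(bits: int, step: float) -> torch.Tensor:
    # Hypothetical symmetric fixed-point grid, e.g. bits=2, step=0.5 -> {-0.5, 0.0, +0.5}.
    k = 2 ** (bits - 1) - 1
    return torch.arange(-k, k + 1, dtype=torch.float32) * step

def mog_penalty(weights: torch.Tensor, modes: torch.Tensor, sigma: float) -> torch.Tensor:
    # Negative log-likelihood of the weights under an equally weighted mixture of
    # Gaussians centred on the fixed-point modes. Minimising this term pulls each
    # weight toward its nearest mode while the task loss is optimised jointly.
    w = weights.reshape(-1, 1)                    # (N, 1)
    sq_dist = (w - modes.reshape(1, -1)) ** 2     # (N, M) squared distance to each mode
    log_probs = -sq_dist / (2.0 * sigma ** 2)     # unnormalised log Gaussian densities
    return -torch.logsumexp(log_probs, dim=1).mean()

def total_loss(task_loss: torch.Tensor, model: nn.Module,
               modes: torch.Tensor, sigma: float, lambda_q: float) -> torch.Tensor:
    # Joint objective: task loss plus the soft-quantization penalty over all
    # weight tensors (biases skipped). Annealing sigma and lambda_q over training
    # would gradually collapse the weight distribution onto the modes; that
    # schedule is an assumption here.
    penalty = sum(mog_penalty(p, modes, sigma) for p in model.parameters() if p.dim() > 1)
    return task_loss + lambda_q * penalty
```

After training has driven each weight close to one of the modes, rounding to the nearest grid value gives the low-bit fixed-point network with little additional loss; the exact annealing and rounding schedule used by SYMOG is described in the full paper.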
