ACIQ: Analytical Clipping for Integer Quantization of neural networks

10/02/2018
by Ron Banner et al.

Unlike traditional approaches that focus on quantization at the level of the entire network, in this work we propose to minimize quantization effects at the tensor level. We analyze the trade-off between quantization noise and clipping distortion in low-precision networks. We identify the statistics of various tensors and derive exact expressions for the mean-square-error degradation due to clipping. By optimizing these expressions, we show marked improvements over standard quantization schemes that normally avoid clipping. For example, simply by choosing accurate clipping values, we obtain an accuracy improvement of more than 40% when quantizing VGG16-BN to 4 bits of precision. Our results have many applications for the quantization of neural networks at both training and inference time. One immediate application is the rapid deployment of neural networks to low-precision accelerators without time-consuming fine-tuning or access to the full datasets.
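The core trade-off can be illustrated numerically: a small clipping value discards outliers (clipping distortion), while a large one stretches the quantization step over a wide range (quantization noise). The sketch below is not the paper's closed-form derivation; it simply brute-force sweeps candidate clipping values over a synthetic Laplace-distributed tensor (a heavy-tailed distribution of the kind the paper models) and picks the MSE-minimizing one.

```python
import numpy as np

def quantize_clip(x, alpha, n_bits=4):
    """Clip x to [-alpha, alpha], then uniformly quantize to 2**n_bits levels."""
    x_c = np.clip(x, -alpha, alpha)
    step = 2 * alpha / (2 ** n_bits - 1)
    return np.round(x_c / step) * step

rng = np.random.default_rng(0)
x = rng.laplace(scale=1.0, size=100_000)  # synthetic heavy-tailed tensor

# Sweep candidate clipping values: small alpha -> clipping distortion,
# large alpha -> coarse steps -> quantization noise.
alphas = np.linspace(0.5, np.abs(x).max(), 50)
mses = [np.mean((x - quantize_clip(x, a)) ** 2) for a in alphas]
best_alpha = alphas[int(np.argmin(mses))]
# The optimum lies well below max|x|: clipping outliers lowers total MSE,
# which is the effect ACIQ's analytical expressions capture in closed form.
```

Under the paper's approach, the sweep is replaced by optimizing the derived MSE expressions directly, so the clipping value can be computed from the tensor's estimated statistics without search.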
