LSQ+: Improving low-bit quantization through learnable offsets and better initialization

by Yash Bhalgat, et al.

Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that are frequently employed in popular efficient architectures can also result in negative activation values, with skewed positive and negative ranges. Typical learnable quantization schemes [PACT, LSQ] assume unsigned quantization for activations and quantize all negative activations to zero, which leads to significant loss in performance. Naively using signed quantization to accommodate these negative values requires an extra sign bit, which is expensive for low-bit (2-, 3-, 4-bit) quantization. To solve this problem, we propose LSQ+, a natural extension of LSQ, wherein we introduce a general asymmetric quantization scheme with trainable scale and offset parameters that can learn to accommodate the negative activations. Gradient-based learnable quantization schemes also commonly suffer from high instability or variance in the final training performance, hence requiring a great deal of hyper-parameter tuning to reach satisfactory performance. LSQ+ alleviates this problem by using an MSE-based initialization scheme for the quantization parameters. We show that this initialization leads to significantly lower variance in final performance across multiple training runs. Overall, LSQ+ shows state-of-the-art results for EfficientNet and MixNet and also significantly outperforms LSQ for low-bit quantization of neural nets with Swish activations (e.g., 1.8% gain with W4A4 quantization and up to 5.6% gain with W2A2 quantization of EfficientNet-B0 on the ImageNet dataset). To the best of our knowledge, ours is the first work to quantize such architectures to extremely low bit-widths.
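The two ideas in the abstract, an asymmetric quantizer with a learnable scale and offset, and an MSE-based initialization of those parameters, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the grid-search `mse_init` below is a simplified stand-in for the paper's MSE-based initialization, and all function names here are hypothetical.

```python
import numpy as np

def lsq_plus_quantize(x, scale, offset, n_bits=4):
    """Asymmetric fake-quantization in the spirit of LSQ+ (illustrative sketch).

    x is mapped onto an unsigned integer grid via a learnable scale s and
    offset beta, then dequantized:
        x_q   = clamp(round((x - beta) / s), q_min, q_max)
        x_hat = x_q * s + beta
    The offset lets an unsigned grid cover negative activations (e.g. Swish
    outputs) without spending a sign bit.
    """
    q_min, q_max = 0, 2 ** n_bits - 1
    x_q = np.clip(np.round((x - offset) / scale), q_min, q_max)
    return x_q * scale + offset  # dequantized ("fake-quantized") output

def mse_init(x, n_bits=4, n_steps=50):
    """Pick (scale, offset) minimizing reconstruction MSE over a simple
    grid of shrunken min/max ranges -- a crude stand-in for the paper's
    MSE-based initialization of the quantization parameters."""
    best_scale, best_offset, best_err = None, None, np.inf
    lo, hi = x.min(), x.max()
    for frac in np.linspace(0.5, 1.0, n_steps):
        scale = frac * (hi - lo) / (2 ** n_bits - 1)
        offset = frac * lo
        err = np.mean((lsq_plus_quantize(x, scale, offset, n_bits) - x) ** 2)
        if err < best_err:
            best_scale, best_offset, best_err = scale, offset, err
    return best_scale, best_offset
```

In training, `scale` and `offset` would be learned jointly with the weights via the straight-through estimator; here they are simply initialized and applied.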




Related papers:

- Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization
- Training Multi-bit Quantized and Binarized Networks with A Learnable Symmetric Quantizer
- PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
- Efficient Activation Quantization via Adaptive Rounding Border for Post-Training Quantization
- Scalable Methods for 8-bit Training of Neural Networks
- Analysis of Quantization on MLP-based Vision Models
- Dynamic Dual Trainable Bounds for Ultra-low Precision Super-Resolution Networks
