RedBit: An End-to-End Flexible Framework for Evaluating the Accuracy of Quantized CNNs

by André Santos et al.

In recent years, Convolutional Neural Networks (CNNs) have become the standard class of deep neural network for image processing, classification and segmentation tasks. However, the large strides in accuracy obtained by CNNs have come from increasing the complexity of network topologies, which incurs sizeable performance and energy penalties in the training and inference of CNNs. Many recent works have validated the effectiveness of parameter quantization, which consists of reducing the bit width of the network's parameters, to attain considerable performance and energy efficiency gains without significantly compromising accuracy. However, it is difficult to compare the relative effectiveness of different quantization methods. To address this problem, we introduce RedBit, an open-source framework that provides a transparent, extensible and easy-to-use interface for evaluating the effectiveness of different algorithms and parameter configurations on network accuracy. We use RedBit to perform a comprehensive survey of five state-of-the-art quantization methods applied to the MNIST, CIFAR-10 and ImageNet datasets. We evaluate a total of 2,300 individual bit width combinations, independently tuning the width of the network's weight and input activation parameters, from 32 bits down to 1 bit (e.g., 8/8, 2/2, 1/32, 1/1, for weights/activations). Upwards of 20,000 hours of computing time on a pool of state-of-the-art GPUs were used to generate all the results in this paper. For 1-bit quantization, the observed top-1 accuracy losses range from as little as 0.26% on MNIST to substantially higher values on CIFAR-10 and ImageNet. We actively encourage the reader to download the source code, experiment with RedBit, and submit their own observed results to our public repository.
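To make the weight/activation bit width configurations above concrete, here is a minimal, illustrative sketch of k-bit uniform weight quantization in the style of DoReFa-Net. This is one common quantization family, not necessarily one of the five methods RedBit surveys (the abstract does not name them), and the function names below are our own:

```python
import torch

def quantize_uniform(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    # Round a tensor whose values lie in [0, 1] to 2**num_bits evenly spaced levels.
    if num_bits >= 32:
        return x  # treated as full precision: no quantization applied
    levels = 2 ** num_bits - 1
    return torch.round(x * levels) / levels

def quantize_weights(w: torch.Tensor, num_bits: int) -> torch.Tensor:
    # DoReFa-style k-bit weight quantization: squash weights with tanh,
    # affinely map them into [0, 1], quantize, then map back to [-1, 1].
    if num_bits == 1:
        # Binary case: sign of each weight, scaled by the mean magnitude
        # (XNOR-Net-style scaling; plain binarization would omit the scale).
        return w.sign() * w.abs().mean()
    w = torch.tanh(w)
    w = w / (2 * w.abs().max()) + 0.5
    return 2 * quantize_uniform(w, num_bits) - 1

# Example: quantize a convolution layer's weights to 2 bits,
# i.e. the weight side of a "2/x" weights/activations configuration.
w = torch.randn(64, 3, 3, 3)
w_q = quantize_weights(w, num_bits=2)
print(torch.unique(w_q))  # four distinct levels in [-1, 1]
```

In schemes of this kind, activations are typically clipped to [0, 1] and passed through the same uniform rounding step with their own bit width, which is how mixed configurations such as 1/32 (binary weights, full-precision activations) arise.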




