Pruning vs Quantization: Which is Better?

07/06/2023
by Andrey Kuzmin, et al.

Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date, only ad-hoc comparisons between the two have been published. In this paper, we set out to answer the question of which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions for neural network hardware going forward. We provide an extensive comparison between the two techniques for compressing deep neural networks. First, we give an analytical comparison of the expected quantization and pruning error for general data distributions. Then, we provide lower bounds for the per-layer pruning and quantization error in trained networks and compare these to the empirical error after optimization. Finally, we provide an extensive experimental comparison of training 8 large-scale models on 3 tasks. Our results show that in most cases quantization outperforms pruning; only in some scenarios with very high compression ratios might pruning be beneficial from an accuracy standpoint.
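As a rough illustration of the per-layer error comparison described in the abstract, the sketch below contrasts the mean-squared error of symmetric uniform (round-to-nearest) quantization with that of magnitude pruning on a toy Gaussian weight tensor. The chosen bit-widths, the naive rate matching (treating b-bit quantization as equivalent to keeping b/32 of the FP32 weights, ignoring sparse-index overhead), and the plain MSE metric are illustrative assumptions, not the paper's exact protocol.

```python
# Hedged sketch: per-tensor quantization error vs magnitude-pruning error
# at a roughly matched compression ratio. Assumptions (not from the paper):
# symmetric uniform round-to-nearest quantization, unstructured magnitude
# pruning, b-bit quantization matched to keeping b/32 of FP32 weights, and
# plain MSE on a toy Gaussian weight tensor.
import numpy as np

def quantization_mse(w: np.ndarray, bits: int) -> float:
    """MSE of symmetric uniform (round-to-nearest) quantization of one tensor."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    w_q = np.clip(np.round(w / scale), -qmax - 1, qmax) * scale
    return float(np.mean((w - w_q) ** 2))

def pruning_mse(w: np.ndarray, keep_fraction: float) -> float:
    """MSE of magnitude pruning: zero out the smallest-magnitude weights."""
    k = int(round(keep_fraction * w.size))
    threshold = np.sort(np.abs(w).ravel())[::-1][max(k - 1, 0)]
    w_p = np.where(np.abs(w) >= threshold, w, 0.0)
    return float(np.mean((w - w_p) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.02, size=(512, 512)).astype(np.float32)  # toy Gaussian layer
    for bits in (8, 4, 2):
        keep = bits / 32.0  # naive matching of compression ratios
        print(f"{bits}-bit quant MSE: {quantization_mse(w, bits):.3e}  |  "
              f"prune (keep {keep:.0%}) MSE: {pruning_mse(w, keep):.3e}")
```

The script only prints the two error terms side by side for one synthetic tensor; the paper's conclusions rest on analytical bounds and large-scale trained-network experiments rather than on a comparison of this kind.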


Related research

02/03/2020
Automatic Pruning for Quantized Neural Networks
Neural network quantization and pruning are two techniques commonly used...

10/15/2021
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Quantization and pruning are core techniques used to reduce the inferenc...

10/14/2020
Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer
When large scale training data is available, one can obtain compact and ...

07/20/2020
Differentiable Joint Pruning and Quantization for Hardware Efficiency
We present a differentiable joint pruning and quantization (DJPQ) scheme...

06/14/2018
Scalable Neural Network Compression and Pruning Using Hard Clustering and L1 Regularization
We propose a simple and easy to implement neural network compression alg...

10/05/2020
Joint Pruning Quantization for Extremely Sparse Neural Networks
We investigate pruning and quantization for deep neural networks. Our go...

08/13/2023
A Survey on Deep Neural Network Pruning-Taxonomy, Comparison, Analysis, and Recommendations
Modern deep neural networks, particularly recent large language models, ...
