Unified Data-Free Compression: Pruning and Quantization without Fine-Tuning

08/14/2023
by   Shipeng Bai, et al.

Structured pruning and quantization are promising approaches for reducing the inference time and memory footprint of neural networks. However, most existing methods require the original training dataset to fine-tune the model, which not only incurs heavy resource consumption but is also infeasible for applications with sensitive or proprietary data due to privacy and security concerns. A few data-free methods have therefore been proposed to address this problem, but they perform data-free pruning and quantization separately and thus fail to exploit the complementarity of the two. In this paper, we propose a novel framework named Unified Data-Free Compression (UDFC), which performs pruning and quantization simultaneously without any data or fine-tuning. Specifically, UDFC starts from the assumption that partial information of a damaged (e.g., pruned or quantized) channel can be preserved by a linear combination of other channels, and then derives a reconstruction form from this assumption to restore the information lost to compression. Finally, we formulate the reconstruction error between the original network and its compressed counterpart and theoretically deduce the closed-form solution. We evaluate UDFC on the large-scale image classification task and obtain significant improvements across various network architectures and compression methods. For example, we achieve a 20.54% accuracy improvement on the ImageNet dataset over the SOTA method with a 30% pruning ratio and 6-bit quantization on ResNet-34.
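The core assumption above — that a damaged channel's information can be preserved by a linear combination of the surviving channels — can be illustrated with a minimal sketch. The snippet below is not the paper's closed-form solution; it simply fits the combination coefficients by least squares over the current layer's filters and folds the pruned channel's contribution into the next layer's weights (function name and the two-matrix layer model are illustrative assumptions):

```python
import numpy as np

def prune_with_compensation(w_cur, w_next, k):
    """Prune output channel k of w_cur and compensate w_next.

    Illustrative sketch of the linear-combination assumption, not
    the paper's exact derivation: coefficients are fit by least
    squares on the flattened filters of the current layer.

    w_cur:  (C_out, C_in)   current layer weights (flattened filters)
    w_next: (C_out2, C_out) next layer weights consuming w_cur's outputs
    """
    keep = [i for i in range(w_cur.shape[0]) if i != k]
    # Fit w_cur[k] ≈ sum_i a_i * w_cur[i] over the remaining filters.
    a, *_ = np.linalg.lstsq(w_cur[keep].T, w_cur[k], rcond=None)
    # Fold the pruned channel's next-layer column into the kept columns,
    # so the next layer's output is (approximately) preserved.
    w_next_comp = w_next[:, keep] + np.outer(w_next[:, k], a)
    return w_cur[keep], w_next_comp
```

When the pruned channel is exactly a linear combination of the others, this compensation preserves the next layer's output exactly; in general it restores only part of the lost information, which is the sense in which the abstract's assumption is "partial".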


research
07/02/2023

Data-Free Quantization via Mixed-Precision Compensation without Fine-Tuning

Neural network quantization is a very promising solution in the field of...
research
06/22/2023

Data-Free Backbone Fine-Tuning for Pruned Neural Networks

Model compression techniques reduce the computational load and memory co...
research
10/14/2020

Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer

When large scale training data is available, one can obtain compact and ...
research
07/12/2019

And the Bit Goes Down: Revisiting the Quantization of Neural Networks

In this paper, we address the problem of reducing the memory footprint o...
research
05/10/2019

Compressing Weight-updates for Image Artifacts Removal Neural Networks

In this paper, we present a novel approach for fine-tuning a decoder-sid...
research
04/02/2022

Paoding: Supervised Robustness-preserving Data-free Neural Network Pruning

When deploying pre-trained neural network models in real-world applicati...
research
01/17/2021

Network Automatic Pruning: Start NAP and Take a Nap

Network pruning can significantly reduce the computation and memory foot...
