Toward Compact Parameter Representations for Architecture-Agnostic Neural Network Compression

by Yuezhou Sun et al.

This paper investigates deep neural network (DNN) compression from the perspective of compactly representing and storing trained parameters. We explore the previously overlooked opportunity of cross-layer, architecture-agnostic representation sharing for DNN parameters. To do this, we decouple feedforward parameters from DNN architectures and leverage additive quantization, an extreme lossy compression method invented for image descriptors, to compactly represent the parameters. The representations are then fine-tuned on task objectives to improve task accuracy. We conduct extensive experiments on MobileNet-v2, VGG-11, ResNet-50, Feature Pyramid Networks, and pruned DNNs trained for classification, detection, and segmentation tasks. The conceptually simple scheme consistently outperforms iterative unstructured pruning. Applied to ResNet-50 with 76.1% accuracy on the ILSVRC12 classification challenge, it achieves a 7.2× compression ratio with no accuracy loss and a 15.3× compression ratio at 74.79% accuracy. Further analyses suggest that representation sharing can frequently happen across network layers and that learning shared representations for an entire DNN can achieve better accuracy at the same compression ratio than compressing the model as multiple separate parts. We release PyTorch code to facilitate DNN deployment on resource-constrained devices and spur future research on efficient representations and storage of DNN parameters.
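The core primitive the abstract relies on, additive quantization, represents a parameter vector as the sum of one codeword drawn from each of M codebooks, so a D-dimensional vector is stored as M small indices instead of D floats. The sketch below is a minimal illustration of that idea, not the paper's implementation: it uses randomly initialized codebooks and a greedy residual assignment, whereas full additive quantization learns the codebooks and jointly optimizes the codeword choices (e.g., via beam search). All function and variable names here are hypothetical.

```python
import numpy as np

def greedy_additive_encode(x, codebooks):
    """Approximate x as a sum of one codeword per codebook.

    Greedy residual assignment -- a simplification of full additive
    quantization, which jointly optimizes the M codeword choices.
    `codebooks` is a list of M arrays of shape (K, D).
    """
    residual = x.astype(float).copy()
    codes = []
    for cb in codebooks:
        # Pick the codeword closest to the current residual.
        idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        codes.append(idx)
        residual -= cb[idx]
    return codes, residual

def additive_decode(codes, codebooks):
    # Reconstruct by summing the selected codewords.
    return sum(cb[i] for cb, i in zip(codebooks, codes))

rng = np.random.default_rng(0)
D, K, M = 8, 16, 4                    # vector dim, codebook size, num codebooks
codebooks = [rng.normal(size=(K, D)) for _ in range(M)]
x = rng.normal(size=D)

codes, residual = greedy_additive_encode(x, codebooks)
x_hat = additive_decode(codes, codebooks)
# Storage cost: M * log2(K) = 16 bits per vector instead of D floats.
```

Because the codebooks are shared across all vectors being compressed, the same machinery extends naturally to the paper's cross-layer setting: parameter vectors from different layers can index into one common set of codebooks.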


