Automatic Sparse Connectivity Learning for Neural Networks

01/13/2022
by Zhimin Tang, et al.

Since sparse neural networks usually contain many zero weights, these unnecessary network connections can potentially be eliminated without degrading network performance. Well-designed sparse neural networks therefore have the potential to significantly reduce FLOPs and computational resources. In this work, we propose a new automatic pruning method, Sparse Connectivity Learning (SCL). Specifically, each weight is re-parameterized as an element-wise multiplication of a trainable weight variable and a binary mask, so that network connectivity is fully described by the binary masks, which are modulated by a unit step function. We theoretically prove the fundamental principle for using a straight-through estimator (STE) for network pruning: the proxy gradient of the STE should be positive, which ensures that the mask variables converge to their minima. After finding that the Leaky ReLU, Softplus, and Identity STEs satisfy this principle, we adopt the Identity STE in SCL for relaxing the discrete masks. We also find that the mask gradients of different features are highly unbalanced; hence, we normalize the mask gradients of each feature to optimize mask-variable training. To train sparse masks automatically, we include the total number of network connections as a regularization term in our objective function. Because SCL does not require designer-defined pruning criteria or per-layer hyper-parameters, the network is explored in a larger hypothesis space to achieve optimized sparse connectivity for the best performance, overcoming the limitations of existing automatic pruning methods. Experimental results demonstrate that SCL can automatically learn and select important network connections for various baseline network structures, and deep learning models trained with SCL outperform state-of-the-art human-designed and automatic pruning methods in sparsity, accuracy, and FLOPs reduction.
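To make the reparameterization concrete, below is a minimal PyTorch sketch (not the authors' released implementation) of a masked linear layer: each weight is multiplied by a binary mask obtained from a unit step function, the backward pass uses an Identity straight-through estimator, and the total number of active connections serves as the sparsity regularization term. Names such as StepSTE, MaskedLinear, and the coefficient lambda_s are illustrative assumptions, and the per-feature mask-gradient normalization described in the abstract is omitted for brevity.

```python
# Minimal sketch of an SCL-style reparameterization: weight * H(s) with an
# Identity STE. Assumed names; not the paper's official code.
import torch
import torch.nn as nn


class StepSTE(torch.autograd.Function):
    """Unit step in the forward pass, Identity STE in the backward pass."""

    @staticmethod
    def forward(ctx, s):
        # Binary mask: 1 where the mask variable is non-negative, else 0.
        return (s >= 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        # Identity STE: pass the incoming gradient through unchanged.
        return grad_output


class MaskedLinear(nn.Module):
    """Linear layer whose connectivity is learned through binary masks."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Trainable mask variables, initialized positive so all connections start "on".
        self.mask_var = nn.Parameter(torch.ones(out_features, in_features))

    def forward(self, x):
        mask = StepSTE.apply(self.mask_var)
        return nn.functional.linear(x, self.weight * mask, self.bias)

    def num_connections(self):
        # Count of active connections; differentiable w.r.t. mask_var via the STE.
        return StepSTE.apply(self.mask_var).sum()


def total_loss(model, task_loss, lambda_s=1e-5):
    # Objective: task loss plus a penalty on the total number of active
    # connections (lambda_s is an assumed regularization strength).
    n_conn = sum(m.num_connections() for m in model.modules()
                 if isinstance(m, MaskedLinear))
    return task_loss + lambda_s * n_conn
```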



