SNIP: Single-shot Network Pruning based on Connection Sensitivity

by   Namhoon Lee, et al.
University of Oxford

Pruning large neural networks while maintaining the performance is often highly desirable due to the reduced space and time complexity. In existing methods, pruning is incorporated within an iterative optimization procedure with either heuristically designed pruning schedules or additional hyperparameters, undermining their utility. In this work, we present a new approach that prunes a given network once at initialization. Specifically, we introduce a saliency criterion based on connection sensitivity that identifies structurally important connections in the network for the given task even before training. This eliminates the need for both pretraining as well as the complex pruning schedule while making it robust to architecture variations. After pruning, the sparse network is trained in the standard way. Our method obtains extremely sparse networks with virtually the same accuracy as the reference network on image classification tasks and is broadly applicable to various architectures including convolutional, residual and recurrent networks. Unlike existing methods, our approach enables us to demonstrate that the retained connections are indeed relevant to the given task.


page 9

page 10


Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

Pruning neural networks at initialization would enable us to find sparse...

COPS: Controlled Pruning Before Training Starts

State-of-the-art deep neural network (DNN) pruning techniques, applied o...

Progressive Skeletonization: Trimming more fat from a network at initialization

Recent studies have shown that skeletonization (pruning parameters) of n...

N2NSkip: Learning Highly Sparse Networks using Neuron-to-Neuron Skip Connections

The over-parametrized nature of Deep Neural Networks leads to considerab...

SiPPing Neural Networks: Sensitivity-informed Provable Pruning of Neural Networks

We introduce a pruning algorithm that provably sparsifies the parameters...

On the Efficiency of K-Means Clustering: Evaluation, Optimization, and Algorithm Selection

This paper presents a thorough evaluation of the existing methods that a...

Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding

The observation of sparse trainable sub-networks within over-parametrize...

Please sign up or login with your details

Forgot password? Click here to reset