Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

by Milad Alizadeh et al.

Pruning neural networks at initialization would enable us to find sparse models that retain the accuracy of the original network while consuming fewer computational resources for training and inference. However, current methods fall short of this goal and lead to a large degradation in model performance. In this paper, we identify a fundamental limitation in the formulation of current methods: their saliency criteria look at a single step at the start of training without taking into account the trainability of the network. While pruning iteratively and gradually has been shown to improve pruning performance, explicit consideration of the training stage that immediately follows pruning has so far been absent from the computation of the saliency criterion. To overcome the short-sightedness of existing methods, we propose Prospect Pruning (ProsPr), which uses meta-gradients through the first few steps of optimization to determine which weights to prune. ProsPr combines an estimate of the higher-order effects of pruning on the loss with the optimization trajectory to identify the trainable sub-network. Our method achieves state-of-the-art pruning performance on a variety of vision classification tasks, while using less data and pruning in a single shot, compared to existing pruning-at-initialization methods.
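To make the idea concrete, here is a minimal, hypothetical sketch of the meta-gradient saliency described above, on a toy linear-regression model. In the paper the meta-gradient is obtained by backpropagating through the unrolled training steps; to keep this sketch dependency-free it instead approximates the same quantity, the derivative of the loss *after a few SGD steps* with respect to each mask entry, by central finite differences. All function names and hyperparameters below are illustrative assumptions, not the authors' implementation.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss_after_k_steps(w0, mask, X, y, lr=0.1, k=3):
    """Apply the mask, run k SGD steps on 0.5*MSE, return the final loss.

    ProsPr's saliency is the gradient of THIS quantity w.r.t. the mask,
    so pruning decisions account for the training that follows pruning.
    """
    n, d = len(X), len(w0)
    w = list(w0)
    for _ in range(k):
        g = [0.0] * d
        for xi, yi in zip(X, y):
            err = dot(xi, [wj * mj for wj, mj in zip(w, mask)]) - yi
            for j in range(d):
                g[j] += mask[j] * err * xi[j] / n  # chain rule through the mask
        w = [wj - lr * gj for wj, gj in zip(w, g)]
    return 0.5 * sum(
        (dot(xi, [wj * mj for wj, mj in zip(w, mask)]) - yi) ** 2
        for xi, yi in zip(X, y)
    ) / n

def prospr_saliency(w0, X, y, k=3, eps=1e-4):
    """Central-difference estimate of |dL_final / dc_i| for each mask entry c_i."""
    d = len(w0)
    sal = []
    for i in range(d):
        up = [1.0] * d; up[i] = 1.0 + eps
        dn = [1.0] * d; dn[i] = 1.0 - eps
        sal.append(abs(
            (loss_after_k_steps(w0, up, X, y, k=k)
             - loss_after_k_steps(w0, dn, X, y, k=k)) / (2 * eps)
        ))
    return sal

# Toy data: only features 0 and 2 matter, so their mask entries should
# receive the highest saliency and survive pruning.
random.seed(0)
true_w = [3.0, 0.0, -2.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
y = [dot(xi, true_w) for xi in X]
sal = prospr_saliency([0.0] * 4, X, y)
keep = sorted(range(4), key=lambda i: -sal[i])[:2]  # keep the top-2 weights
```

In a real network the finite-difference loop would be replaced by a single backward pass through the unrolled inner steps (e.g. `create_graph=True` in PyTorch), which yields all mask gradients at once; the toy version above only illustrates what that meta-gradient measures.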


