Rare Gems: Finding Lottery Tickets at Initialization

by   Kartik Sreenivasan, et al.

It has been widely observed that large neural networks can be pruned to a small fraction of their original size, with little loss in accuracy, by typically following a time-consuming "train, prune, re-train" approach. Frankle Carbin (2018) conjecture that we can avoid this by training lottery tickets, i.e., special sparse subnetworks found at initialization, that can be trained to high accuracy. However, a subsequent line of work presents concrete evidence that current algorithms for finding trainable networks at initialization, fail simple baseline comparisons, e.g., against training random sparse subnetworks. Finding lottery tickets that train to better accuracy compared to simple baselines remains an open problem. In this work, we partially resolve this open problem by discovering rare gems: subnetworks at initialization that attain considerable accuracy, even before training. Refining these rare gems - "by means of fine-tuning" - beats current baselines and leads to accuracy competitive or better than magnitude pruning methods.


page 1

page 2

page 3

page 4


Pruning at Initialization – A Sketching Perspective

The lottery ticket hypothesis (LTH) has increased attention to pruning n...

The Lottery Ticket Hypothesis: Finding Small, Trainable Neural Networks

Neural network compression techniques are able to reduce the parameter c...

Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients

Pruning neural networks at initialization would enable us to find sparse...

Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset

That neural networks may be pruned to high sparsities and retain high ac...

Lottery Tickets on a Data Diet: Finding Initializations with Sparse Trainable Networks

A striking observation about iterative magnitude pruning (IMP; Frankle e...

[Reproducibility Report] Rigging the Lottery: Making All Tickets Winners

RigL, a sparse training algorithm, claims to directly train sparse netwo...

Please sign up or login with your details

Forgot password? Click here to reset