Dynamic Sparse Training: Find Efficient Sparse Network From Scratch With Trainable Masked Layers

05/14/2020
by Junjie Liu, et al.

We present a novel network pruning algorithm called Dynamic Sparse Training that can jointly find the optimal network parameters and sparse network structure in a unified optimization process, using trainable pruning thresholds. These thresholds are adjusted dynamically, in a fine-grained and layer-wise manner, via backpropagation. We demonstrate that Dynamic Sparse Training can easily train very sparse neural network models with little performance loss, using the same number of training epochs as dense models. It achieves state-of-the-art performance compared with other sparse training algorithms on various network architectures. Additionally, we make several surprising observations that provide strong evidence for the effectiveness and efficiency of our algorithm. These observations reveal underlying problems of traditional three-stage pruning algorithms and show how our algorithm can guide the design of more compact network architectures.
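To make the idea concrete, the PyTorch sketch below illustrates a trainable masked layer of the kind the abstract describes: the layer carries a learnable pruning threshold, weights whose magnitude falls below it are masked out by a step function, and a surrogate gradient lets the threshold itself be trained by backpropagation. This is a minimal sketch under stated assumptions, not the paper's implementation: the names BinaryStep, MaskedLinear, sparse_penalty, and alpha are hypothetical, the boxcar surrogate gradient is a simplification of whatever gradient approximation the paper uses, and the exp(-t) sparsity penalty is one plausible way to reward higher thresholds.

import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryStep(torch.autograd.Function):
    # Hard step used for masking. Its true derivative is zero almost
    # everywhere, so the backward pass uses a simple boxcar surrogate
    # that passes gradients only near the threshold boundary.
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float()

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1.0).float()

class MaskedLinear(nn.Module):
    # Linear layer with a trainable per-neuron pruning threshold
    # (a hypothetical sketch, not the paper's exact formulation).
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        nn.init.kaiming_uniform_(self.weight, a=5 ** 0.5)
        # Thresholds start at zero, so training begins fully dense.
        self.threshold = nn.Parameter(torch.zeros(out_features, 1))

    def forward(self, x):
        # Weights whose magnitude falls below the threshold are zeroed.
        mask = BinaryStep.apply(self.weight.abs() - self.threshold)
        return F.linear(x, self.weight * mask, self.bias)

    def sparse_penalty(self):
        # exp(-t) shrinks as thresholds grow, so minimizing it
        # rewards higher sparsity.
        return torch.exp(-self.threshold).sum()

# Training would add the penalty to the task loss, e.g.
#   loss = F.cross_entropy(model(x), y) \
#          + alpha * sum(m.sparse_penalty() for m in masked_layers)
# where a larger alpha pushes thresholds up and prunes more weights.

Because both the weights and the thresholds receive gradients from a single objective, pruning happens jointly with training rather than as a separate post-hoc stage, which is the contrast with three-stage (train, prune, fine-tune) pipelines that the abstract draws.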


Related research

11/01/2021
Learning Pruned Structure and Weights Simultaneously from Scratch: an Attention based Approach
As a deep learning model typically contains millions of trainable weight...

02/12/2021
Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
That neural networks may be pruned to high sparsities and retain high ac...

04/15/2022
End-to-End Sensitivity-Based Filter Pruning
In this paper, we present a novel sensitivity-based filter pruning algor...

10/10/2018
Pruning neural networks: is it time to nip it in the bud?
Pruning is a popular technique for compressing a neural network: a large...

06/21/2023
Fantastic Weights and How to Find Them: Where to Prune in Dynamic Sparse Training
Dynamic Sparse Training (DST) is a rapidly evolving area of research tha...

04/27/2018
CompNet: Neural networks growing via the compact network morphism
It is often the case that the performance of a neural network can be imp...

05/31/2023
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability
Is the lottery ticket phenomenon an idiosyncrasy of gradient-based train...
