Most works on transformers trained with the Masked Language Modeling (ML...
In many practical scenarios – like hyperparameter search or continual re...
Practitioners prune neural networks for efficiency gains and generalizat...
Methods for improving the efficiency of deep network training (i.e. the ...
Practitioners frequently observe that pruning improves model generalizat...
Modern deep learning involves training costly, highly overparameterized ...
Legal literature on machine learning (ML) tends to focus on harms, and a...
A striking observation about iterative magnitude pruning (IMP; Frankle e...
Benchmarking the tradeoff between neural network accuracy and training t...
AI's rapid growth has been felt acutely by scholarly venues, leading to ...
As datasets and models become increasingly large, distributed training h...
Studying neural network loss landscapes provides insights into the natur...
Magnitude pruning is a common, effective technique to identify sparse su...
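The magnitude criterion mentioned above can be sketched in a few lines. A minimal illustration, assuming a NumPy weight array and a target sparsity level (the function name and shapes here are illustrative, not taken from any specific paper):

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries until `sparsity` fraction
    of the weights are removed; returns the resulting binary mask."""
    k = int(sparsity * weights.size)  # number of weights to remove
    if k == 0:
        return np.ones_like(weights)
    threshold = np.sort(np.abs(weights).ravel())[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)

w = np.array([[0.5, -0.1], [0.05, -0.9]])
mask = magnitude_prune(w, sparsity=0.5)  # keeps the 2 largest magnitudes
```

Applying `w * mask` then yields the pruned subnetwork's weights.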
The computer vision world has been regaining enthusiasm in various pre-...
We revisit and extend the experiments of Goodfellow et al. (2014), who s...
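The Goodfellow et al. (2014) experiments referenced here evaluate the loss along the straight line between two parameter vectors. A minimal sketch with a toy quadratic loss (the function name and setup are illustrative):

```python
import numpy as np

def interpolate_loss(loss_fn, theta_a, theta_b, num=5):
    """Evaluate a loss along the line (1-a)*theta_a + a*theta_b, the
    one-dimensional slice used in Goodfellow et al. (2014)-style plots."""
    return [loss_fn((1 - a) * theta_a + a * theta_b)
            for a in np.linspace(0.0, 1.0, num)]

# Toy example: a quadratic loss evaluated between two parameter vectors.
losses = interpolate_loss(lambda th: float(np.sum(th ** 2)),
                          np.array([-1.0, 0.0]), np.array([1.0, 0.0]), num=5)
```

In practice `theta_a` and `theta_b` would be flattened network weights and `loss_fn` the training or test loss.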
Self-supervised learning has recently begun to rival supervised learning...
Recent work has explored the possibility of pruning neural networks at i...
In natural language processing (NLP), enormous pre-trained models like B...
We show that the error of magnitude-pruned networks follows a scaling la...
Neural network pruning—the task of reducing the size of a network by rem...
Many neural network pruning algorithms proceed in three steps: train the...
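A toy end-to-end illustration of that train → prune → retrain pipeline, assuming a simple linear-regression setting in NumPy (the data, sparsity level, and hyperparameters are illustrative, not drawn from any of the papers above):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]      # sparse ground-truth weights
y = X @ true_w

def train(X, y, w, mask, steps=500, lr=0.05):
    """Gradient descent on squared error, keeping pruned weights at zero."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(y)
        w = (w - lr * grad) * mask
    return w

mask = np.ones(10)
w = train(X, y, np.zeros(10), mask)   # step 1: train to completion
keep = np.argsort(np.abs(w))[-3:]     # step 2: prune to the 3 largest weights
mask = np.zeros(10)
mask[keep] = 1.0
w = train(X, y, w * mask, mask)       # step 3: retrain the surviving weights
```

Here retraining recovers the sparse solution exactly because the ground truth is itself sparse; real networks only approximately recover the dense model's accuracy.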
Batch normalization (BatchNorm) has become an indispensable tool for tra...
Recent studies have shown that many important aspects of neural network ...
We introduce "instability analysis," a framework for assessing whether t...
Pruning is a standard technique for removing unnecessary structure from ...
Recent work on the "lottery ticket hypothesis" proposes that randomly-in...
Neural network compression techniques are able to reduce the parameter c...
Recent work on neural network pruning indicates that, at training time, ...