How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers

by Yuanhao Xiong, et al.

Many optimizers have been proposed for training deep neural networks, and they often have multiple hyperparameters, which makes it tricky to benchmark their performance. In this work, we propose a new benchmarking protocol to evaluate both end-to-end efficiency (training a model from scratch without knowing the best hyperparameters) and data-addition training efficiency (reusing previously selected hyperparameters to periodically re-train the model on newly collected data). For end-to-end efficiency, previous work assumes random hyperparameter tuning, which over-emphasizes the tuning time; we instead propose to evaluate with a bandit hyperparameter tuning strategy. We conduct a human study showing that our evaluation protocol matches human tuning behavior better than random search does. For data-addition training, we propose a new protocol for assessing hyperparameter sensitivity to data shift. We then apply the proposed benchmarking framework to 7 optimizers on various tasks, including computer vision, natural language processing, reinforcement learning, and graph mining. Our results show that there is no clear winner across all the tasks.
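To make the contrast between random tuning and bandit tuning concrete, here is a minimal sketch in Python. The abstract does not say which bandit strategy is used, so successive halving appears here purely as an illustrative assumption. The loss function `toy_loss`, the learning-rate search range, and the budget values are all hypothetical stand-ins for real training runs, not the paper's setup.

```python
import math
import random


def toy_loss(lr, epochs):
    """Hypothetical validation loss: best near lr = 0.1, improving with more epochs."""
    return (math.log10(lr) + 1.0) ** 2 + 1.0 / (1 + epochs)


def random_search(configs, epochs_per_config):
    """Train every configuration for the full budget and keep the best one."""
    cost = len(configs) * epochs_per_config
    best = min(configs, key=lambda lr: toy_loss(lr, epochs_per_config))
    return best, cost


def successive_halving(configs, min_epochs=1, eta=3):
    """Train all configurations briefly, keep the best 1/eta, repeat with eta times more epochs.

    Cost is counted as if each rung retrains from scratch (a pessimistic estimate).
    """
    if not configs:
        raise ValueError("configs must not be empty")
    survivors = list(configs)
    epochs, cost = min_epochs, 0
    while True:
        cost += len(survivors) * epochs
        survivors.sort(key=lambda lr: toy_loss(lr, epochs))
        if len(survivors) == 1:
            return survivors[0], cost
        survivors = survivors[: max(1, len(survivors) // eta)]
        epochs *= eta


# Sample 27 learning rates log-uniformly from [1e-4, 1] (an assumed search space).
rng = random.Random(0)
configs = [10 ** rng.uniform(-4, 0) for _ in range(27)]

best_rs, cost_rs = random_search(configs, epochs_per_config=27)
best_sh, cost_sh = successive_halving(configs, min_epochs=1, eta=3)
print(f"random search:      lr={best_rs:.4f}, epochs spent={cost_rs}")
print(f"successive halving: lr={best_sh:.4f}, epochs spent={cost_sh}")
```

In this toy setting the ranking of configurations does not change with the number of epochs, so early elimination loses nothing. Real training curves can cross, which is exactly why the choice of tuning strategy affects how an optimizer's end-to-end efficiency is measured.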


