Forward and Reverse Gradient-Based Hyperparameter Optimization

03/06/2017
by Luca Franceschi, et al.

We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent. These procedures mirror two methods of computing gradients for recurrent neural networks and have different trade-offs in terms of running time and space requirements. Our formulation of the reverse-mode procedure is linked to previous work by Maclaurin et al. [2015] but does not require reversible dynamics. The forward-mode procedure is suitable for real-time hyperparameter updates, which may significantly speed up hyperparameter optimization on large datasets. We present experiments on data cleaning and on learning task interactions. We also present one large-scale experiment where the use of previous gradient-based methods would be prohibitive.
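As a rough illustration of the paper's two procedures, the sketch below (assuming JAX; this is not the authors' implementation, and all names and the toy data are hypothetical) computes the gradient of a validation error with respect to a single hyperparameter, the learning rate, by differentiating through unrolled gradient descent. Forward mode propagates a tangent alongside the training dynamics in a single sweep; reverse mode backpropagates through the stored trajectory.

```python
# A minimal sketch, assuming JAX, of forward- and reverse-mode
# hypergradients for the learning rate of unrolled gradient descent.
import jax
import jax.numpy as jnp

def train(lr, w0, X_tr, y_tr, steps=100):
    # Unrolled gradient descent on the training loss (the iterative dynamics).
    grad_loss = jax.grad(lambda w: jnp.mean((X_tr @ w - y_tr) ** 2))
    w = w0
    for _ in range(steps):
        w = w - lr * grad_loss(w)
    return w

def val_error(lr, w0, X_tr, y_tr, X_val, y_val):
    # Validation error as a function of the hyperparameter.
    w = train(lr, w0, X_tr, y_tr)
    return jnp.mean((X_val @ w - y_val) ** 2)

# Toy regression data, purely illustrative.
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
w_true = jnp.array([1.0, -2.0, 0.5])
X_tr = jax.random.normal(k1, (64, 3))
y_tr = X_tr @ w_true
X_val = jax.random.normal(k2, (32, 3))
y_val = X_val @ w_true
w0 = jnp.zeros(3)
lr0 = jnp.asarray(0.05)

# Forward mode: one sweep in the same direction as training, carrying
# dw_t/d(lr) alongside w_t; memory does not grow with the number of steps.
E, dE_dlr_fwd = jax.jvp(
    lambda lr: val_error(lr, w0, X_tr, y_tr, X_val, y_val),
    (lr0,), (jnp.ones_like(lr0),))

# Reverse mode (cf. Maclaurin et al. [2015]): backpropagate through the
# stored trajectory; memory grows with the number of training steps.
dE_dlr_rev = jax.grad(val_error)(lr0, w0, X_tr, y_tr, X_val, y_val)
```

The sketch mirrors the trade-off stated in the abstract: forward mode costs one tangent propagation per hyperparameter but constant memory (hence its suitability for real-time updates), while reverse mode handles many hyperparameters in one backward pass at the price of storing the training trajectory.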


Related research

- 02/11/2015 · Gradient-based Hyperparameter Optimization through Reversible Learning
  Tuning hyperparameters of learning algorithms is hard because gradients ...

- 09/29/2019 · Gradient Descent: The Ultimate Optimizer
  Working with any gradient-based machine learning algorithm involves the ...

- 10/10/2019 · Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings
  We propose probabilistic models that can extrapolate learning curves of ...

- 06/11/2020 · Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously
  Generalization is a central problem in Machine Learning. Most prediction...

- 01/05/2016 · DrMAD: Distilling Reverse-Mode Automatic Differentiation for Optimizing Hyperparameters of Deep Neural Networks
  The performance of deep neural networks is well-known to be sensitive to...

- 06/06/2023 · Stochastic Marginal Likelihood Gradients using Neural Tangent Kernels
  Selecting hyperparameters in deep learning greatly impacts its effective...

- 10/04/2015 · Implicit stochastic approximation
  The need to carry out parameter estimation from massive data has reinvig...
