Painless step size adaptation for SGD

02/01/2021
by Ilona Kulikovskikh, et al.

Convergence and generalization are two crucial aspects of performance in neural networks. When analyzed separately, these properties can lead to contradictory results: optimizing the convergence rate yields fast training, but does not guarantee the best generalization error. To avoid this conflict, recent studies suggest adopting a moderately large step size for optimizers, but its added value for performance remains unclear. We propose the LIGHT function with four configurations that explicitly regulate the improvement in convergence and generalization on test data. This contribution makes it possible to: 1) improve both the convergence and generalization of neural networks without needing to guarantee their stability; 2) build more reliable and explainable network architectures without the need for overparameterization. We refer to this as "painless" step size adaptation.
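For context, the sketch below shows where a step-size rule of this kind plugs into a plain SGD loop. The abstract does not give the form of the LIGHT function, so the decaying schedule here (base_lr / sqrt(t)) is only a hypothetical placeholder, and the toy gradient is an assumption for illustration.

```python
import numpy as np

def sgd_with_step_schedule(grad_fn, w0, base_lr=0.5, n_steps=2000, seed=0):
    """Plain SGD whose step size is set by a schedule at every iteration.

    The schedule below (base_lr / sqrt(t)) is a hypothetical stand-in:
    the paper's LIGHT function is not reproduced here, so this only
    illustrates where such a step-size rule would plug in.
    """
    rng = np.random.default_rng(seed)
    w = float(w0)
    for t in range(1, n_steps + 1):
        g = grad_fn(w, rng)            # stochastic gradient estimate
        step = base_lr / np.sqrt(t)    # placeholder for a LIGHT-style rule
        w -= step * g
    return w

# Toy usage: minimize E[(w - 3)^2] from noisy gradients; w should end near 3.
noisy_grad = lambda w, rng: 2.0 * (w - 3.0) + rng.normal(scale=0.5)
print(sgd_with_step_schedule(noisy_grad, w0=0.0))
```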


Related research

06/19/2020
On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
This paper analyzes the trajectories of stochastic gradient descent (SGD...

11/14/2017
A Robust Variable Step Size Fractional Least Mean Square (RVSS-FLMS) Algorithm
In this paper, we propose an adaptive framework for the variable step si...

12/10/2018
Why Does Stagewise Training Accelerate Convergence of Testing Error Over SGD?
Stagewise training strategy is commonly used for learning neural network...

03/02/2021
Convergence Rate of the (1+1)-Evolution Strategy with Success-Based Step-Size Adaptation on Convex Quadratic Functions
The (1+1)-evolution strategy (ES) with success-based step-size adaptatio...

12/01/2012
Cumulative Step-size Adaptation on Linear Functions
The CSA-ES is an Evolution Strategy with Cumulative Step size Adaptation...

06/07/2021
Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks
Neural network compression techniques have become increasingly popular a...

02/13/2019
Uniform convergence may be unable to explain generalization in deep learning
We cast doubt on the power of uniform convergence-based generalization b...
