Taming neural networks with TUSLA: Non-convex learning via adaptive stochastic gradient Langevin algorithms

06/25/2020
by   Attila Lovas, et al.

Artificial neural networks (ANNs) are typically highly nonlinear systems that are finely tuned via the optimization of their associated, non-convex loss functions. Typically, the gradient of any such loss function fails to be dissipative, making the use of widely accepted (stochastic) gradient descent methods problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), which we call the tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems with the use of ANNs. Thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. The roots of the TUSLA algorithm lie in the taming technology for diffusion processes with superlinear coefficients, as developed in Sabanis (2013, 2016), and for MCMC algorithms in Brosse et al. (2019). Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the new algorithm, in comparison to vanilla SGLD, within the framework of ANNs.
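To illustrate the idea behind taming, the sketch below contrasts a vanilla SGLD step with a tamed one. The taming factor shown here, 1 + sqrt(step) * ||g||, follows the general tamed-algorithm literature (e.g. the TULA scheme of Brosse et al.); the paper's precise taming factor for TUSLA differs in its details, so treat this purely as an illustrative sketch, not the authors' exact update rule.

```python
import numpy as np

def sgld_step(theta, grad, step=1e-3, beta=1e8, rng=None):
    """Vanilla SGLD update: a superlinearly large gradient can blow up theta."""
    rng = rng or np.random.default_rng(0)
    noise = rng.standard_normal(theta.shape)
    return theta - step * grad + np.sqrt(2.0 * step / beta) * noise

def tusla_like_step(theta, grad, step=1e-3, beta=1e8, rng=None):
    """Tamed Langevin update (illustrative sketch).

    The stochastic gradient is rescaled by a factor growing with its own
    magnitude, so the drift term is uniformly bounded by sqrt(1/step):
    one huge gradient sample can no longer cause the iterate to diverge.
    """
    rng = rng or np.random.default_rng(0)
    tamed = grad / (1.0 + np.sqrt(step) * np.linalg.norm(grad))  # taming
    noise = rng.standard_normal(theta.shape)
    return theta - step * tamed + np.sqrt(2.0 * step / beta) * noise
```

Even when the gradient has norm on the order of 1e12 (as can happen with superlinearly growing loss gradients), the tamed step moves the iterate by at most roughly sqrt(step), whereas the vanilla SGLD step moves it by step times the full gradient norm.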

Related research

- 02/13/2017: Non-convex learning via Stochastic Gradient Langevin Dynamics: a nonasymptotic analysis
- 04/06/2020: Non-Convex Stochastic Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics
- 12/09/2005: Evolving Stochastic Learning Algorithm Based on Tsallis Entropic Index
- 11/19/2015: Online Batch Selection for Faster Training of Neural Networks
- 07/19/2021: Non-asymptotic estimates for TUSLA algorithm for non-convex learning with applications to neural networks with ReLU activation function
- 09/19/2022: On the Theoretical Properties of Noise Correlation in Stochastic Optimization
- 03/11/2021: BODAME: Bilevel Optimization for Defense Against Model Extraction
