Who breaks early, looses: goal oriented training of deep neural networks based on port Hamiltonian dynamics

04/14/2023
by Julian Burghoff et al.

The highly structured energy landscape of the loss as a function of the parameters of a deep neural network makes it necessary to use sophisticated optimization strategies in order to discover (local) minima that guarantee reasonable performance. Escaping less suitable local minima is an important prerequisite, and momentum methods are often employed to achieve this. As in other non-local optimization procedures, this creates the need to balance exploration against exploitation. In this work, we suggest an event-based control mechanism for switching from exploration to exploitation once a predefined reduction of the loss function has been reached. Giving the momentum method a port-Hamiltonian interpretation, we apply the 'heavy ball with friction' picture and trigger braking (friction) once certain goals are achieved. We benchmark our method against standard stochastic gradient descent and provide experimental evidence that deep neural networks perform better when our strategy is applied.
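A minimal sketch of the idea on a toy scalar problem, not the paper's implementation: momentum descent is run as discretized heavy-ball dynamics, theta'' + gamma(t) * theta' + grad L(theta) = 0, with the friction gamma switched off during exploration and switched on (braking) once the loss has reached a predefined reduction goal. The function names and the `goal_fraction` parameter below are hypothetical, chosen for illustration only.

```python
import math

def loss(theta):
    # Toy non-convex loss with several local minima.
    return 0.1 * theta ** 2 + math.sin(3.0 * theta)

def grad(theta):
    return 0.2 * theta + 3.0 * math.cos(3.0 * theta)

def goal_triggered_heavy_ball(theta0, lr=0.01, friction=0.9,
                              goal_fraction=0.5, steps=2000):
    """Heavy-ball dynamics with event-triggered braking: explore with
    zero friction, then brake once the loss has fallen below a
    predefined fraction of its initial value."""
    theta, velocity = theta0, 0.0
    goal = goal_fraction * loss(theta0)   # predefined loss-reduction goal
    braking = False
    for _ in range(steps):
        if not braking and loss(theta) <= goal:
            braking = True                # event: exploration -> exploitation
        gamma = friction if braking else 0.0
        # Discretization of  theta'' + gamma * theta' + grad L(theta) = 0
        velocity = (1.0 - gamma) * velocity - lr * grad(theta)
        theta += velocity
    return theta, loss(theta)

theta_star, final_loss = goal_triggered_heavy_ball(theta0=5.0)
print(f"theta = {theta_star:.3f}, loss = {final_loss:.3f}")
```

Delaying friction lets the trajectory coast over shallow local minima; once the goal is met, braking dissipates the kinetic energy so the iterate settles into the basin it has reached.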
