MinMax Networks

06/15/2023
by Winfried Lohmiller, et al.

While much progress has been achieved over the last decades in neuro-inspired machine learning, fundamental theoretical problems remain in gradient-based learning with combinations of neurons. These problems, such as saddle points and suboptimal plateaus of the cost function, can cause learning to fail in both theory and practice. In addition, selecting a discrete step size for the gradient is problematic, since steps that are too large lead to instability while steps that are too small slow down learning. This paper describes an alternative discrete MinMax learning approach for continuous piecewise-linear functions. Global exponential convergence of the algorithm is established using Contraction Theory with Inequality Constraints, which this paper extends from the continuous to the discrete case. In contrast to deep learning, the parametrization of each linear piece of the proposed MinMax network is linear in its parameters. This allows a linear-regression stability proof as long as measurements do not transition from one linear region to a neighbouring one. The step size of the discrete gradient descent is limited by a Lagrangian constraint orthogonal to the edge between two neighbouring linear pieces. It is shown that this Lagrangian step limitation, in contrast to a step-size limitation along the direction of the gradient, does not reduce the convergence rate of the unconstrained system dynamics. We show that the convergence rate of constrained piecewise-linear function learning is equivalent to the exponential convergence rates of the individual local linear regions.
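To make the parametrization concrete, below is a minimal Python sketch (not the authors' implementation) of a MinMax network, i.e. a continuous piecewise-linear function written as a min over groups of maxima of affine pieces, together with a plain unconstrained gradient step on the active piece. The group and piece counts, the toy target, and the learning rate are illustrative assumptions, and the Lagrangian step limitation at region boundaries described in the abstract is deliberately omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 2 groups, each the max of 3 affine pieces, 2-D input.
n_groups, n_pieces, dim = 2, 3, 2
W = rng.normal(size=(n_groups, n_pieces, dim))   # weights of all affine pieces
b = rng.normal(size=(n_groups, n_pieces))        # offsets of all affine pieces

def minmax_forward(x):
    """f(x) = min_j max_i (w_ji . x + b_ji); also return the active piece (j, i)."""
    vals = W @ x + b                    # (n_groups, n_pieces) affine values
    i_max = vals.argmax(axis=1)         # max within each group
    group_vals = vals[np.arange(n_groups), i_max]
    j_min = int(group_vals.argmin())    # min across groups
    return group_vals[j_min], (j_min, int(i_max[j_min]))

def gradient_step(x, y, lr=0.1):
    """Unconstrained gradient step on the squared error.

    Only the active piece is updated, and its output is linear in
    (w, b), so this is an ordinary linear-regression (LMS) update.
    """
    y_hat, (j, i) = minmax_forward(x)
    err = y_hat - y
    W[j, i] -= lr * err * x             # df/dw = x on the active piece
    b[j, i] -= lr * err                 # df/db = 1 on the active piece

# Toy usage: fit max(x0, x1), itself a continuous piecewise-linear target.
for _ in range(2000):
    x = rng.uniform(-1.0, 1.0, size=dim)
    gradient_step(x, max(x[0], x[1]))

x_test = np.array([0.3, -0.7])
print(minmax_forward(x_test)[0], "vs target", max(x_test[0], x_test[1]))
```

Because each region's output is linear in its own parameters, the within-region dynamics are those of ordinary linear regression; the paper's contribution lies in bounding the step orthogonally to region edges so that transitions between neighbouring regions do not destroy this convergence, which the sketch above does not attempt.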


