The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program

10/13/2021
by   Yifei Wang, et al.
25

We study non-convex subgradient flows for training two-layer ReLU neural networks from a convex geometry and duality perspective. We characterize the implicit bias of unregularized non-convex gradient flow as convex regularization of an equivalent convex model. We then show that the limit points of non-convex subgradient flows can be identified via primal-dual correspondence in this convex optimization problem. Moreover, we derive a sufficient condition on the dual variables which ensures that the stationary points of the non-convex objective are the KKT points of the convex objective, thus proving convergence of non-convex gradient flows to the global optimum. For a class of regular training data distributions such as orthogonal separable data, we show that this sufficient condition holds. Therefore, non-convex gradient flows in fact converge to optimal solutions of a convex optimization problem. We present numerical results verifying the predictions of our theory for non-convex subgradient descent.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/31/2023

Optimal Sets and Solution Paths of ReLU Networks

We develop an analytical framework to characterize the set of optimal Re...
research
02/28/2023

Non-convex shape optimization by dissipative Hamiltonian flows

Shape optimization with constraints given by partial differential equati...
research
03/16/2023

Variational Principles for Mirror Descent and Mirror Langevin Dynamics

Mirror descent, introduced by Nemirovski and Yudin in the 1970s, is a pr...
research
06/24/2020

Provably Convergent Working Set Algorithm for Non-Convex Regularized Regression

Owing to their statistical properties, non-convex sparse regularizers ha...
research
06/26/2018

On Representer Theorems and Convex Regularization

We establish a general principle which states that regularizing an inver...
research
12/09/2020

Convex Regularization Behind Neural Reconstruction

Neural networks have shown tremendous potential for reconstructing high-...
research
01/09/2018

Convexification of Neural Graph

Traditionally, most complex intelligence architectures are extremely non...

Please sign up or login with your details

Forgot password? Click here to reset