A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

10/01/2022
by Michael Matena, et al.

The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern followed by a constrained convex problem once the activation patterns are fixed. In this work, we explore the implications of this combinatorial aspect of ReLU optimization. We show that it can be naturally modeled via a geometric and combinatorial object known as a zonotope, whose vertex set is isomorphic to the set of feasible activation patterns. This correspondence aids analysis and provides a foundation for further research. We demonstrate its usefulness by exploring the sensitivity of the optimal loss to perturbations of the training data. We then discuss methods of zonotope vertex selection and their relevance to optimization: overparameterization assists in training by making a randomly chosen vertex more likely to contain a good solution. We introduce a novel polynomial-time vertex selection procedure that provably picks a vertex containing the global optimum using only double the minimum number of parameters required to fit the data. Finally, we introduce a local greedy search heuristic over zonotope vertices and demonstrate that it outperforms gradient descent on underparameterized problems.
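To make the decomposition concrete, below is a minimal sketch (not the authors' code) of the convex subproblem: once the activation pattern is fixed, each unit's ReLU is replaced by a 0/1 mask, the network output becomes linear in the hidden weights, and squared loss yields a convex quadratic program whose constraints force the pre-activation signs to agree with the chosen pattern. It assumes squared loss and second-layer signs fixed in advance (their magnitudes can be absorbed into the hidden weights by the positive homogeneity of ReLU); the helper name `fit_fixed_pattern` and the use of cvxpy are illustrative assumptions.

```python
# A sketch of the convex subproblem for a fixed activation pattern.
import numpy as np
import cvxpy as cp

def fit_fixed_pattern(X, y, P, s):
    """Solve the constrained convex problem for one activation pattern.

    X : (n, d) training inputs
    y : (n,)   training targets
    P : (n, m) 0/1 pattern, P[i, j] = 1 iff unit j is active on
               example i -- corresponds to one zonotope vertex
    s : (m,)   fixed signs (+/-1) of the second-layer weights
    """
    n, d = X.shape
    m = P.shape[1]
    X1 = np.hstack([X, np.ones((n, 1))])   # fold biases into the weights
    W = cp.Variable((d + 1, m))            # hidden weights and biases
    pre = X1 @ W                           # pre-activations, shape (n, m)
    # On the fixed pattern, ReLU(pre) == P * pre, so the network output
    # is linear in W and the squared loss is convex.
    out = cp.sum(cp.multiply(P * s, pre), axis=1)
    # Feasibility: pre-activation signs must agree with the pattern.
    cons = [cp.multiply(2 * P - 1, pre) >= 0]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(out - y)), cons)
    prob.solve()
    return W.value, prob.value

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))
    y = rng.normal(size=20)
    P = (rng.random((20, 4)) < 0.5).astype(float)  # a random vertex
    s = np.array([1.0, 1.0, -1.0, -1.0])
    W, loss = fit_fixed_pattern(X, y, P, s)
    print(loss)
```

The NP-hard part of training lives entirely outside this solver, in the choice of which pattern `P` (which zonotope vertex) to hand to it; the paper's vertex selection procedures and greedy search heuristic operate at that outer, combinatorial level.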

Related Research

09/27/2018 · Complexity of Training ReLU Neural Network
In this paper, we explore some basic questions on the complexity of trai...

02/26/2017 · Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Deep learning models are often successfully trained using gradient desce...

08/03/2022 · Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Dynamical systems theory has recently been applied in optimization to pr...

06/05/2023 · Does a sparse ReLU network training problem always admit an optimum?
Given a training set, a loss function, and a neural network architecture...

07/16/2017 · Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
In this paper we study the problem of learning a shallow artificial neur...

11/04/2019 · Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals
We consider the problem of computing the best-fitting ReLU with respect ...

05/06/2021 · The layer-wise L1 Loss Landscape of Neural Nets is more complex around local minima
For fixed training data and network parameters in the other layers the L...
