Large Scale Structure of Neural Network Loss Landscapes

06/11/2019
by Stanislav Fort, et al.

There are many surprising and perhaps counter-intuitive properties of optimization of deep neural networks. We propose and experimentally verify a unified phenomenological model of the loss landscape that incorporates many of them. High dimensionality plays a key role in our model. Our core idea is to model the loss landscape as a set of high-dimensional wedges that together form a large-scale, inter-connected structure towards which optimization is drawn. We first show that hyperparameter choices such as learning rate, network width, and L_2 regularization affect the path the optimizer takes through the landscape in similar ways, influencing the large-scale curvature of the regions the optimizer explores. We then predict and demonstrate new counter-intuitive properties of the loss landscape. We show the existence of low-loss subspaces connecting a set (not only a pair) of solutions, and verify it experimentally. Finally, we analyze recently popular ensembling techniques for deep networks in the light of our model.

