Why Flatness Correlates With Generalization For Deep Neural Networks

03/10/2021
by Shuofeng Zhang et al.

The intuition that local flatness of the loss landscape is correlated with better generalization for deep neural networks (DNNs) has been explored for decades, spawning many different local flatness measures. Here we argue that these measures correlate with generalization because they are local approximations to a global property, the volume of the set of parameters mapping to a specific function. This global volume is equivalent to the Bayesian prior upon initialization. For functions that give zero error on a test set, it is directly proportional to the Bayesian posterior, making volume a more robust and theoretically better grounded predictor of generalization than flatness. Whilst flatness measures fail under parameter re-scaling, volume remains invariant and therefore continues to correlate well with generalization. Moreover, some variants of SGD can break the flatness-generalization correlation, while the volume-generalization correlation remains intact.
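The re-scaling argument can be made concrete with a small sketch (my own illustration, not code from the paper): in a ReLU network, multiplying one layer's weights by a factor alpha and dividing the next layer's by alpha leaves the computed function unchanged, yet it moves to a different point in parameter space where local sensitivity, and hence any curvature-based flatness measure, is different. Volume, being a property of the function's full pre-image in parameter space, is unaffected.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # first-layer weights
W2 = rng.standard_normal((2, 4))   # second-layer weights
x = rng.standard_normal(3)

def f(x, W1, W2):
    """Two-layer ReLU network (no biases)."""
    return W2 @ np.maximum(W1 @ x, 0.0)

alpha = 10.0
# Re-scaled parameters: alpha * W1 and W2 / alpha implement the SAME function,
# since relu(alpha * z) = alpha * relu(z) for alpha > 0.
W1s, W2s = alpha * W1, W2 / alpha
assert np.allclose(f(x, W1, W2), f(x, W1s, W2s))

# Crude sharpness proxy: how much the output moves under a fixed small
# perturbation E of the second-layer weights.
E = 1e-3 * rng.standard_normal(W2.shape)
s_orig = np.linalg.norm(f(x, W1, W2 + E) - f(x, W1, W2))
s_scaled = np.linalg.norm(f(x, W1s, W2s + E) - f(x, W1s, W2s))
print(s_scaled / s_orig)  # ratio is alpha: "sharpness" changed, function did not
```

The perturbation response at the re-scaled point is exactly alpha times larger, so a sharpness measure built on such sensitivity (or on Hessian eigenvalues) reports a "sharper" minimum even though the network computes an identical function, which is the failure mode the volume-based view avoids.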


Related research

- 11/29/2019 — A Reparameterization-Invariant Flatness Measure for Deep Neural Networks
- 12/02/2019 — The intriguing role of module criticality in the generalization of deep networks
- 01/03/2020 — Feature-Robustness, Flatness and Generalization Error for Deep Neural Networks
- 03/10/2021 — Robustness to Pruning Predicts Generalization in Deep Neural Networks
- 01/29/2020 — The Case for Bayesian Deep Learning
- 05/25/2019 — Global Minima of DNNs: The Plenty Pantry
- 06/23/2021 — Minimum sharpness: Scale-invariant parameter-robustness of neural networks
