Implicit Data-Driven Regularization in Deep Neural Networks under SGD

11/26/2021
by Xuran Meng, et al.

Much research effort has been devoted to explaining the success of deep learning. Random Matrix Theory (RMT) provides an emerging path to this end: spectral analysis of the large random matrices involved in a trained deep neural network (DNN), such as weight matrices or Hessian matrices arising from the stochastic gradient descent (SGD) algorithm. In this paper, we conduct extensive experiments on weight matrices across different modules, e.g., layers, networks, and data sets, to analyze the evolution of their spectra. We find that these spectra can be classified into three main types: the Marčenko-Pastur spectrum (MP), the Marčenko-Pastur spectrum with a few bleeding outliers (MPB), and the heavy-tailed spectrum (HT). Moreover, the observed spectrum type is directly connected to the degree of regularization in the DNN. We argue that the degree of regularization depends on the quality of the data fed to the DNN, hence the name Data-Driven Regularization. These findings are validated on several network architectures, using Gaussian synthetic data and real data sets (MNIST and CIFAR10). Finally, exploiting the connection between spectrum types and degrees of regularization, we propose a spectral criterion and use it to construct an early-stopping procedure that detects, without any test data, when the DNN has become highly regularized. Such early-stopped DNNs avoid unnecessary extra training while preserving comparable generalization ability.
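To make the spectrum-type idea concrete, the following Python sketch shows one way to compare the empirical spectrum of a weight matrix with the Marčenko-Pastur bulk and roughly label it as MP, MPB, or HT. It assumes a trained weight matrix W of shape (n, p) with roughly i.i.d.-like entries of variance sigma^2; the function names (esd, classify_spectrum) and the thresholds outlier_tol and heavy_tail_frac are illustrative choices, not the spectral criterion actually proposed in the paper.

import numpy as np

def esd(W):
    """Empirical spectral distribution of (1/p) W W^T for an n x p matrix W."""
    n, p = W.shape
    sv = np.linalg.svd(W, compute_uv=False)       # singular values of W
    return np.sort(sv ** 2 / p)[::-1]             # eigenvalues, descending

def classify_spectrum(W, sigma2=None, outlier_tol=0.05, heavy_tail_frac=0.1):
    """Crudely label a weight-matrix spectrum as "MP", "MPB" or "HT".

    sigma2 is the variance scale of the Marchenko-Pastur law (roughly
    estimated from the mean eigenvalue if not given). The thresholds
    outlier_tol and heavy_tail_frac are illustrative, not the paper's
    spectral criterion.
    """
    n, p = W.shape
    if n > p:                                     # work with the wide orientation
        W, (n, p) = W.T, (p, n)
    lam = esd(W)                                  # n eigenvalues of (1/p) W W^T
    q = n / p                                     # aspect ratio, <= 1
    if sigma2 is None:
        sigma2 = lam.mean()                       # mean eigenvalue = sigma^2 under MP
    lam_plus = sigma2 * (1 + np.sqrt(q)) ** 2     # MP upper bulk edge
    outliers = int(np.sum(lam > lam_plus * (1 + outlier_tol)))
    if outliers / len(lam) > heavy_tail_frac:
        return "HT"                               # a sizable tail escapes the bulk
    if outliers > 0:
        return "MPB"                              # a few spikes bleed out of the bulk
    return "MP"                                   # everything stays inside the bulk

# Example: an untrained Gaussian layer should look like a pure MP spectrum.
rng = np.random.default_rng(0)
print(classify_spectrum(rng.normal(size=(256, 1024))))   # typically "MP"

In this sketch, a spectral early-stopping rule would amount to monitoring classify_spectrum on the layers of interest during training and halting once the targeted spectrum type (e.g., HT, indicating strong regularization) appears; the precise stopping rule used in the paper may differ.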

