Convergence of Deep Neural Networks to a Hierarchical Covariance Matrix Decomposition

03/14/2017
by Nima Dehmamy, et al.

We show that in a deep neural network trained with ReLU activations, the low-lying layers should be replaceable with layers that use a truncated linear activation. We derive the gradient descent equations for this truncated linear model and demonstrate that, if the distribution of the training data is stationary during training, the optimal weights for these low-lying layers are the eigenvectors of the covariance matrix of the data. If the training data is sufficiently random and uniform, these eigenvectors can be found from a small fraction of the training data, reducing the computational cost of training. We show how this procedure can be applied recursively to form successive trained layers. Our tests show that, at least for the first layer, this approach improves image classification while reducing network size.
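The procedure sketched in the abstract can be illustrated in a few lines of NumPy: estimate the covariance matrix from a small subsample of the data, take its leading eigenvectors as the layer weights, and recurse on the layer's output to form the next layer. This is a minimal sketch, not the authors' implementation; the function names (covariance_eigen_layer, build_layers) and the sample_frac parameter are illustrative assumptions, and the truncated linear activation is approximated here by a plain linear map.

import numpy as np

def covariance_eigen_layer(X, n_components, sample_frac=0.1, seed=0):
    """Estimate layer weights as the leading eigenvectors of the data
    covariance matrix, computed from a small random subsample of X."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    idx = rng.choice(n, size=max(1, int(sample_frac * n)), replace=False)
    Xs = X[idx] - X[idx].mean(axis=0)        # center the subsample
    cov = Xs.T @ Xs / len(idx)               # empirical covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending
    return eigvecs[:, order[:n_components]]  # (n_features, n_components)

def build_layers(X, layer_sizes, sample_frac=0.1, seed=0):
    """Recursively form successive layers: each layer's weights are the
    leading covariance eigenvectors of the previous layer's output."""
    layers, H = [], X
    for k in layer_sizes:
        W = covariance_eigen_layer(H, k, sample_frac, seed)
        layers.append(W)
        H = H @ W   # linear forward pass feeds the next layer's covariance
    return layers

# Example: two covariance-derived layers on stand-in data shaped like
# flattened 28x28 images (random here, for illustration only).
X = np.random.randn(10000, 784)
layers = build_layers(X, layer_sizes=[128, 64])

Because the eigenvectors come from a subsample, the cost of forming each layer scales with sample_frac times the dataset size, rather than requiring full gradient descent over all low-lying weights.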

