Asymptotic convergence rate of Dropout on shallow linear neural networks

12/01/2020
by Albert Senen-Cerda, et al.

We analyze the convergence rate of gradient flow on objective functions induced by Dropout and Dropconnect when applied to shallow linear Neural Networks (NNs), a setting that can equivalently be viewed as matrix factorization with a particular regularizer. Dropout algorithms of this kind are regularization techniques that use {0, 1}-valued random variables to filter weights during training in order to avoid co-adaptation of features. By leveraging a recent result on nonconvex optimization and conducting a careful analysis of the set of minimizers as well as the Hessian of the loss function, we obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which are in qualitative agreement with the bound and match it when starting sufficiently close to a minimizer.
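To make the setup concrete, below is a minimal sketch (not the authors' code) of stochastic gradient descent, a standard discretization of the gradient flow, on the Dropout objective for a shallow linear NN y = W2 W1 x with dropout applied to the hidden units. The dimensions, keep-probability p, step size, and synthetic data are all illustrative assumptions; the comments spell out how taking the expectation over masks yields the regularized matrix-factorization objective mentioned above.

import numpy as np

rng = np.random.default_rng(0)

# Shallow linear NN x -> W2 @ (W1 @ x); dropout masks the `width` hidden units.
d_in, d_out, width = 8, 4, 16   # illustrative sizes, not taken from the paper
p = 0.5                          # keep-probability of each hidden unit
lr, n_steps = 5e-3, 3000         # step size and iteration budget (assumptions)

X = rng.standard_normal((d_in, 200))        # synthetic inputs, one column per sample
Y = rng.standard_normal((d_out, d_in)) @ X  # targets from a random linear teacher
n = X.shape[1]

W1 = rng.standard_normal((width, d_in)) / np.sqrt(d_in)
W2 = rng.standard_normal((d_out, width)) / np.sqrt(width)

def expected_dropout_loss(W1, W2):
    """Objective induced by Dropout: E_B (1/n) ||Y - (1/p) W2 diag(B) W1 X||_F^2
    with B_i ~ Bernoulli(p). The expectation equals the plain factorization loss
    plus a product-of-norms regularizer weighted by (1 - p) / p."""
    fit = np.linalg.norm(Y - W2 @ W1 @ X) ** 2
    reg = (1 - p) / p * np.sum(np.sum(W2**2, axis=0) * np.sum((W1 @ X) ** 2, axis=1))
    return (fit + reg) / n

for step in range(n_steps):
    B = (rng.random(width) < p) / p              # scaled Bernoulli mask, E[B_i] = 1
    H = (W1 * B[:, None]) @ X                    # masked hidden activations
    R = W2 @ H - Y                               # residual under this mask
    # Gradients of the sampled loss (1/n) ||W2 diag(B) W1 X - Y||_F^2,
    # an unbiased single-mask estimate of the expected objective above.
    W2 -= lr * (2 / n) * R @ H.T
    W1 -= lr * (2 / n) * (B[:, None] * (W2.T @ R)) @ X.T
    if step % 500 == 0:
        print(step, expected_dropout_loss(W1, W2))

Tracking the expected loss along such a run, started near a minimizer, gives a qualitative picture of how the convergence speed varies with the dropout probability p and the width; this is the kind of comparison against the theoretical bound that the abstract's simulations refer to.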


