Surprising properties of dropout in deep networks

02/14/2016
by David P. Helmbold, et al.

We analyze dropout in deep networks with rectified linear units and the quadratic loss. Our results expose surprising differences between the behavior of dropout and more traditional regularizers like weight decay. For example, on some simple data sets dropout training produces negative weights even when the target output is simply the sum of the inputs. This provides a counterpoint to the suggestion that dropout discourages co-adaptation of weights. We also show that the dropout penalty can grow exponentially in the depth of the network while the weight-decay penalty remains essentially linear, and that dropout is insensitive to various re-scalings of the input features, outputs, and network weights. This last insensitivity implies that there are no isolated local minima of the dropout training criterion. Our work uncovers new properties of dropout, extends our understanding of why dropout succeeds, and lays the foundation for further progress.
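
To make the setting concrete, below is a minimal sketch (not code from the paper) of dropout training under the quadratic loss on data whose target is the sum of the inputs, followed by a check for negative learned weights. The network size, retain probability, learning rate, and the choice to apply the dropout mask to the inputs are all illustrative assumptions.

# Minimal sketch of dropout training with ReLU and quadratic loss.
# Hyperparameters and architecture are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 2000, 5, 8        # samples, input dim, hidden units (assumed)
p = 0.5                     # dropout retain probability (assumed)
lr, epochs = 1e-2, 500      # learning rate and epochs (assumed)

X = rng.uniform(0.0, 1.0, size=(n, d))
y = X.sum(axis=1, keepdims=True)   # target output is the sum of the inputs

W1 = rng.normal(0, 0.1, size=(d, h))
W2 = rng.normal(0, 0.1, size=(h, 1))

for _ in range(epochs):
    # Inverted dropout on the inputs: keep each feature with probability p,
    # scaling kept units by 1/p so activations match in expectation.
    mask = (rng.random(X.shape) < p) / p
    Xd = X * mask
    Z = Xd @ W1                    # pre-activations
    A = np.maximum(Z, 0.0)         # rectified linear units
    pred = A @ W2
    err = pred - y                 # gradient of the (halved) quadratic loss
    gW2 = A.T @ err / n
    gZ = (err @ W2.T) * (Z > 0)    # backprop through the ReLU
    gW1 = Xd.T @ gZ / n
    W1 -= lr * gW1
    W2 -= lr * gW2

print("negative entries in W1:", int((W1 < 0).sum()), "of", W1.size)

Replacing the dropout mask with an explicit weight-decay term in the same loop would give a baseline for comparing the two regularizers' effects on the sign of the learned weights.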


