Bayesian neural networks increasingly sparsify their units with depth

10/11/2018
by Mariia Vladimirova, et al.

We investigate deep Bayesian neural networks with Gaussian priors on the weights and ReLU-like nonlinearities, shedding light on novel sparsity-inducing mechanisms at the level of the units of the network, both pre- and post-nonlinearity. The main thrust of the paper is to establish that the prior distribution of the units becomes increasingly heavy-tailed with depth. We show that first-layer units are Gaussian, second-layer units are sub-exponential, and we introduce sub-Weibull distributions to characterize the units of deeper layers. Bayesian neural networks with Gaussian priors are well known to induce a weight-decay penalty on the weights. In contrast, our result indicates a more elaborate regularisation scheme at the level of the units, ranging from convex penalties for the first two layers (weight decay for the first and Lasso for the second) to non-convex penalties for deeper layers. Thus, although weight decay does not allow the weights to be set exactly to zero, sparse solutions tend to be selected for the units from the second layer onward. This result provides new theoretical insight into deep Bayesian neural networks, underpinning their natural shrinkage properties and practical potential.
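The heavy-tail claim can be checked numerically. Below is a minimal sketch (not from the paper): it draws prior samples of a small ReLU network with i.i.d. Gaussian weights and tracks the excess kurtosis of one pre-activation unit per layer as a crude proxy for tail heaviness. The width, depth, and sample count are illustrative choices; a small width is used because at large width each layer is close to Gaussian by the central limit theorem.

```python
import numpy as np

rng = np.random.default_rng(0)
width, depth, n = 5, 4, 50_000  # small width so finite-width tail effects are visible

# Same unit-norm input for every prior draw of the weights
h = np.ones((n, width)) / np.sqrt(width)

kurtosis = {}
for ell in range(1, depth + 1):
    # Fresh Gaussian prior draw of the layer weights for each of the n samples,
    # with the usual 1/sqrt(fan_in) scaling
    W = rng.normal(0.0, 1.0, size=(n, width, width)) / np.sqrt(width)
    pre = np.einsum("nij,nj->ni", W, h)   # pre-activations of layer ell
    u = pre[:, 0]                         # marginal prior law of one unit
    z = (u - u.mean()) / u.std()
    kurtosis[ell] = float(np.mean(z**4) - 3.0)  # excess kurtosis; 0 for Gaussian
    h = np.maximum(pre, 0.0)              # ReLU post-activations feed the next layer

for ell, k in kurtosis.items():
    print(f"layer {ell}: excess kurtosis ~ {k:.2f}")
```

Consistent with the paper's result, layer 1 comes out near zero (its units are exactly Gaussian, since the input is fixed), while deeper layers show increasingly positive excess kurtosis, i.e. heavier tails.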

Related research

- 04/23/2021, Exact priors of finite neural networks
  Bayesian neural networks are theoretically well-understood only in the i...
- 10/06/2021, Bayesian neural network unit priors and generalized Weibull-tail property
  The connection between Bayesian neural networks and Gaussian processes g...
- 08/02/2019, Network with Sub-Networks
  We introduce network with sub-network, a neural network which their weig...
- 05/27/2019, Equivalent and Approximate Transformations of Deep Neural Networks
  Two networks are equivalent if they produce the same output for any give...
- 02/10/2022, Exact Solutions of a Deep Linear Network
  This work finds the exact solutions to a deep linear network with weight...
- 10/01/2018, Probabilistic Meta-Representations Of Neural Networks
  Existing Bayesian treatments of neural networks are typically characteri...
- 12/29/2022, Bayesian Interpolation with Deep Linear Networks
  This article concerns Bayesian inference using deep linear networks with...
