Entropy and mutual information in models of deep neural networks

05/24/2018
by   Marylou Gabrié, et al.

We examine a class of deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) We show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that weight matrices are independent and orthogonally invariant. (ii) We extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layer networks with Gaussian random weights, using the recently introduced adaptive interpolation method. (iii) We propose an experimental framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive.
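As a toy illustration of the kind of quantity the paper studies, consider a single linear-Gaussian layer y = Wx + ξ with x ~ N(0, I), ξ ~ N(0, σ²I), and a Gaussian random weight matrix as in contribution (ii). In this simple case the mutual information I(x; y) has the closed form ½ log det(I + WWᵀ/σ²). This is only a minimal single-layer sketch, not the paper's multi-layer method; all dimensions and parameter values below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_out = 20, 10
sigma2 = 0.1  # variance of the additive Gaussian noise

# Gaussian random weight matrix with variance 1/n_in entries
W = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_out, n_in))

# For the linear-Gaussian channel y = W x + xi, with x ~ N(0, I) and
# xi ~ N(0, sigma2 * I), the mutual information (in nats) is
#   I(x; y) = 1/2 * log det(I + W W^T / sigma2)
_, logdet = np.linalg.slogdet(np.eye(n_out) + W @ W.T / sigma2)
mi = 0.5 * logdet
print(mi)
```

The paper's contribution is precisely that such closed forms do not extend naively to deep nonlinear networks; their statistical-physics formula replaces this determinant with a multi-layer variational expression.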


Related research

- 02/10/2022, Information Flow in Deep Neural Networks
  Although deep neural networks have been immensely successful, there is n...
- 02/25/2018, The Mutual Information in Random Linear Estimation Beyond i.i.d. Matrices
  There has been definite progress recently in proving the variational sin...
- 01/16/2021, DeepMI: A Mutual Information Based Framework For Unsupervised Deep Learning of Tasks
  In this work, we propose an information theory based framework DeepMI to...
- 02/08/2021, Mutual Information of Neural Network Initialisations: Mean Field Approximations
  The ability to train randomly initialised deep neural networks is known ...
- 05/13/2018, Doing the impossible: Why neural networks can be trained at all
  As deep neural networks grow in size, from thousands to millions to bill...
- 02/19/2019, Mutual Information for the Stochastic Block Model by the Adaptive Interpolation Method
  We rigorously derive a single-letter variational expression for the mutu...
- 12/04/2022, Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
  In deep learning, neural networks serve as noisy channels between input ...
