Minimum Description Length codes are critical

09/03/2018
by Ryan Cubero, et al.

In Minimum Description Length (MDL), learning from data is equivalent to an optimal coding problem. We show that the codes that achieve optimal compression in MDL are critical in a very precise sense. First, when they are taken as generative models of samples, they generate samples with broad empirical distributions and with a high value of the relevance, defined as the entropy of the empirical frequencies. These results are derived for different statistical models (Dirichlet model, independent and pairwise dependent spin models, and restricted Boltzmann machines). Second, MDL codes sit precisely at a second-order phase transition point where the symmetry between the sampled outcomes is spontaneously broken. The order parameter controlling the phase transition is the coding cost of the samples. The phase transition is a manifestation of the optimality of MDL codes, and it arises because codes that achieve a higher compression do not exist. These results suggest a clear interpretation of the widespread occurrence of statistical criticality as a characterization of samples that are maximally informative about the underlying generative process.
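
To make the notion of relevance concrete, here is a minimal Python sketch. The abstract only says that the relevance is the entropy of the empirical frequencies; the specific form used below is an assumption, not taken from this page: the entropy of the distribution of k, the number of times an outcome is observed, where a sample point falls on an outcome seen k times with probability k·m_k/N (m_k is the number of distinct outcomes seen exactly k times, N the sample size). The function names `relevance` and `resolution` are illustrative, and `resolution` (the entropy of the outcomes themselves) is included only for comparison.

```python
from collections import Counter
from math import log


def relevance(sample):
    """Entropy (in nats) of the observed frequencies k.

    ASSUMPTION: uses the form -sum_k (k*m_k/N) * log(k*m_k/N), where m_k is
    the number of distinct outcomes that occur exactly k times in the sample.
    """
    n = len(sample)
    counts = Counter(sample)        # k_s: occurrences of each outcome s
    m = Counter(counts.values())    # m_k: number of outcomes seen exactly k times
    return -sum((k * m_k / n) * log(k * m_k / n) for k, m_k in m.items())


def resolution(sample):
    """Entropy (in nats) of the empirical distribution over outcomes."""
    n = len(sample)
    counts = Counter(sample)
    return -sum((k / n) * log(k / n) for k in counts.values())


if __name__ == "__main__":
    for s in ["aaaa", "abcd", "aabbc", "aaaabbc"]:
        print(s, round(relevance(s), 4), round(resolution(s), 4))
```

Under this assumed definition, a sample in which every outcome appears equally often (for example, all distinct, or all identical) has zero relevance, whereas a sample with a broad spread of frequencies has a relevance close to its resolution. That is the sense in which broad empirical distributions go together with high relevance.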

