Random scattering of bits by prediction
We investigate a population of binary mistake sequences that result from learning with parametric models of different order. We obtain estimates of their error, algorithmic complexity and divergence from a purely random Bernoulli sequence. We study the relationship of these variables to the learner's information density parameter ρ, defined as the ratio of the length of the compressed file to the length of the uncompressed file that contains the learner's decision rule. The results indicate that good learners have a low information density ρ, while bad learners have a high ρ. Bad learners generate mistake sequences that are atypically complex or that diverge stochastically from a purely random Bernoulli sequence. Good learners generate typically complex sequences with low divergence from Bernoulli sequences, and they include the mistake sequences generated by the Bayes optimal predictor. Based on the static algorithmic interference model of Ratsaby_entropy, the learner here acts as a static structure which "scatters" the bits of an input sequence (to be predicted) in proportion to its information density ρ, thereby deforming its randomness characteristics.
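As a rough illustration of the two quantities the abstract relies on, the sketch below computes an information density ρ as a compressed-to-uncompressed length ratio and derives a binary mistake sequence from predictions. The choice of zlib as the compressor, the byte encoding of the decision rule, and the function names `information_density`, `mistake_sequence` and `error_rate` are all assumptions made for this example; the abstract does not say which compressor or file format is used.

```python
import os
import zlib


def information_density(rule_bytes: bytes) -> float:
    """Return rho = compressed length / uncompressed length.

    Assumption: zlib at its highest compression level stands in for
    whatever compressor the study actually uses.
    """
    if not rule_bytes:
        raise ValueError("decision rule file must not be empty")
    return len(zlib.compress(rule_bytes, 9)) / len(rule_bytes)


def mistake_sequence(predictions, targets):
    """Return a binary sequence with 1 wherever the prediction is wrong."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have equal length")
    return [int(p != t) for p, t in zip(predictions, targets)]


def error_rate(mistakes):
    """Return the fraction of ones in a non-empty mistake sequence."""
    return sum(mistakes) / len(mistakes)


if __name__ == "__main__":
    # A highly regular rule compresses well, so its rho is low.
    regular_rule = b"01" * 5000
    # Random bytes are essentially incompressible, so rho is near 1.
    irregular_rule = os.urandom(10000)
    print("rho (regular rule):  ", information_density(regular_rule))
    print("rho (irregular rule):", information_density(irregular_rule))

    mistakes = mistake_sequence([0, 1, 1, 0], [0, 1, 0, 1])
    print("mistakes:", mistakes, "error rate:", error_rate(mistakes))
```

Under these assumptions, a decision rule with a lot of regularity yields a small ρ and one that looks like noise yields a ρ close to 1. The sketch does not estimate the algorithmic complexity of the mistake sequence or its divergence from a Bernoulli sequence.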