Fast calculation of entropy with Zhang's estimator

07/26/2017
by Antoni Lozano, et al.

Entropy is a fundamental property of a repertoire. Here, we present an efficient algorithm to estimate the entropy of types with the help of Zhang's estimator. The algorithm takes advantage of the fact that the number of different frequencies in a text is in general much smaller than the number of types. We justify the convenience of the algorithm by means of an analysis of the statistical properties of texts from more than 1000 languages. Our work opens up various possibilities for future research.
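The computational trick hinted at in the abstract can be sketched concretely. Zhang's estimator writes the entropy as a series in which the contribution of a type depends only on its frequency, so all types sharing a frequency contribute the same inner term. Grouping types by frequency (the frequency spectrum) therefore lets that term be evaluated once per distinct frequency rather than once per type. The Python sketch below illustrates this grouping under the standard statement of Zhang's estimator; the function name zhang_entropy and the use of collections.Counter are illustrative choices, not the paper's reference implementation.

```python
from collections import Counter

def zhang_entropy(tokens):
    """Zhang's entropy estimate (in nats), computed by grouping types
    that share the same frequency, so the inner series is evaluated
    once per distinct frequency instead of once per type."""
    counts = Counter(tokens)              # type -> frequency n_i
    N = sum(counts.values())              # total number of tokens
    spectrum = Counter(counts.values())   # frequency r -> number of types with frequency r
    H = 0.0
    for r, num_types in spectrum.items():
        # Inner series: sum_{v=1}^{N-r} (1/v) * prod_{j=1}^{v} (1 - (r-1)/(N-j)).
        # The product is extended one factor at a time as v grows; beyond
        # v = N - r every factor is zero, so the sum truncates there.
        prod = 1.0
        inner = 0.0
        for v in range(1, N - r + 1):
            prod *= 1.0 - (r - 1) / (N - v)
            inner += prod / v
        H += num_types * (r / N) * inner
    return H

# Toy usage: estimate the entropy of word types in a short text.
tokens = "a rose is a rose is a rose".split()
print(zhang_entropy(tokens))
```

In this sketch, with R distinct frequencies the loops cost roughly O(R·N) rather than the O(K·N) of a per-type evaluation, where K is the number of types and N the number of tokens; the saving comes precisely from R being much smaller than K in typical texts, as the abstract notes.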

Related research

11/14/2016 · Quantitative Entropy Study of Language Complexity
We study the entropy of Chinese and English texts, based on characters i...

07/11/2017 · On the letter frequencies and entropy of written Marathi
We carry out a comprehensive analysis of letter frequencies in contempor...

03/02/2019 · A Bayesian Nonparametric Estimation to Entropy
A Bayesian nonparametric estimator to entropy is proposed. The derivatio...

12/17/2021 · Generalized LRS Estimator for Min-entropy Estimation
The min-entropy is a widely used metric to quantify the randomness of ge...

11/18/2016 · Statistical Properties of European Languages and Voynich Manuscript Analysis
The statistical properties of letters frequencies in European literature...

04/18/2016 · Efficient Calculation of Bigram Frequencies in a Corpus of Short Texts
We show that an efficient and popular method for calculating bigram freq...

05/08/2021 · Understanding Neural Networks with Logarithm Determinant Entropy Estimator
Understanding the informative behaviour of deep neural networks is chall...
