Estimation and Concentration of Missing Mass of Functions of Discrete Probability Distributions

10/05/2021
by Prafulla Chandra, et al.

Given a positive function g from [0,1] to the reals, the function's missing mass in a sequence of i.i.d. samples, defined as the sum of g(pr(x)) over the missing letters x, is introduced and studied. The missing mass of a function generalizes the classical missing mass and has several connections to related estimation problems. Minimax estimation is studied for order-α missing mass (g(p) = p^α) for both integer and non-integer values of α. Exact minimax convergence rates are obtained for the integer case. Concentration is studied for a class of functions, and specific results are derived for order-α missing mass and missing Shannon entropy (g(p) = -p log p). Sub-Gaussian tail bounds with near-optimal worst-case variance factors are derived. Two new notions of concentration, named strongly sub-Gamma and filtered sub-Gaussian concentration, are introduced and shown to result in right tail bounds that are better than those obtained from sub-Gaussian concentration.
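To make the definition concrete, the following minimal Python sketch computes the missing mass of a function for a known distribution and a given sample. It is an illustration only, not code from the paper: the names `missing_mass_of_function`, `order_alpha`, and `shannon_term`, as well as the toy distribution `p` and the sample, are assumptions introduced here, and pr(x) is read as the probability of letter x.

```python
import math

def missing_mass_of_function(p, sample, g):
    """Sum g(p[x]) over the letters x of the alphabet that are missing from the sample.

    p      : dict mapping each letter to its probability (assumed known here)
    sample : iterable of observed letters
    g      : function from [0, 1] to the reals
    """
    seen = set(sample)
    return sum(g(px) for x, px in p.items() if x not in seen)

def order_alpha(alpha):
    """Return g(q) = q**alpha, the order-alpha choice of g."""
    return lambda q: q ** alpha

def shannon_term(q):
    """Return g(q) = -q log q, with the usual convention g(0) = 0."""
    return -q * math.log(q) if q > 0 else 0.0

# Hypothetical toy distribution and sample, chosen only for illustration.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
sample = ["a", "b", "a", "a"]  # letters "c" and "d" are missing

classical = missing_mass_of_function(p, sample, order_alpha(1))   # classical missing mass
order_two = missing_mass_of_function(p, sample, order_alpha(2))   # order-2 missing mass
entropy = missing_mass_of_function(p, sample, shannon_term)       # missing Shannon entropy
```

With g(p) = p the function reduces to the classical missing mass, here the total probability of the unseen letters "c" and "d". In an estimation setting the distribution is unknown, so this quantity cannot be computed directly; the sketch only illustrates what is being estimated.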


research
05/19/2020

Revisiting Concentration of Missing Mass

We revisit the problem of missing mass concentration, deriving Bernstein...
research
03/20/2015

A Bennett Inequality for the Missing Mass

Novel concentration inequalities are obtained for the missing mass, i.e....
research
06/26/2023

Optimal estimation of high-order missing masses, and the rare-type match problem

Consider a random sample (X_1,…,X_n) from an unknown discrete distributi...
research
02/25/2014

Novel Deviation Bounds for Mixture of Independent Bernoulli Variables with Application to the Missing Mass

In this paper, we are concerned with obtaining distribution-free concent...
research
03/10/2015

Novel Bernstein-like Concentration Inequalities for the Missing Mass

We are concerned with obtaining novel concentration inequalities for the...
research
10/17/2021

Multifractal of mass function

Multifractal plays an important role in many fields. However, there is f...
research
02/27/2019

Consistent estimation of the missing mass for feature models

Feature models are popular in machine learning and they have been recent...
