Long-Tail Theory under Gaussian Mixtures

07/20/2023
by Arman Bolatov, et al.

We suggest a simple Gaussian mixture model for data generation that complies with Feldman's long-tail theory (2020). We demonstrate that, in the proposed model, a linear classifier cannot decrease the generalization error below a certain level, whereas a nonlinear classifier with memorization capacity can. This confirms that, for long-tailed distributions, rare training examples must be taken into account for optimal generalization to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on synthetic and real data.
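The sketch below is only an illustration of the kind of setup the abstract describes, not the authors' construction: it assumes a Gaussian mixture whose subpopulation frequencies follow a Zipf law (so most subpopulations are rare), assigns each subpopulation its own label, and compares a linear classifier against a small nonlinear one using scikit-learn. The function name sample_long_tail_mixture and all parameter choices are hypothetical.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def sample_long_tail_mixture(n, n_subpops=50, dim=2, zipf_a=1.5):
    """Draw n points from a Gaussian mixture whose subpopulation
    frequencies follow a truncated Zipf law, producing a long tail of
    rare subpopulations. Each subpopulation carries a fixed binary label."""
    weights = 1.0 / np.arange(1, n_subpops + 1) ** zipf_a
    weights /= weights.sum()                      # normalize mixture weights
    centers = rng.normal(scale=5.0, size=(n_subpops, dim))
    labels_per_subpop = rng.integers(0, 2, size=n_subpops)
    comp = rng.choice(n_subpops, size=n, p=weights)
    X = centers[comp] + rng.normal(scale=0.5, size=(n, dim))
    y = labels_per_subpop[comp]
    return X, y

X_train, y_train = sample_long_tail_mixture(2000)
X_test, y_test = sample_long_tail_mixture(20000)

# Linear model: cannot fit the scattered, mixed-label subpopulations.
linear = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Nonlinear model with enough capacity to memorize rare subpopulations.
nonlinear = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000,
                          random_state=0).fit(X_train, y_train)

print("linear test accuracy:   ", linear.score(X_test, y_test))
print("nonlinear test accuracy:", nonlinear.score(X_test, y_test))

In such a toy setting the nonlinear model typically generalizes better precisely because it can fit the rare subpopulations seen only a few times in training, which is the intuition the abstract attributes to long-tailed distributions.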


research · 05/16/2022 · Fat-Tailed Variational Inference with Anisotropic Tail Adaptive Flows
While fat-tailed densities commonly arise as posterior and marginal dist...

research · 01/16/2022 · GradTail: Learning Long-Tailed Data Using Gradient-based Sample Weighting
We propose GradTail, an algorithm that uses gradients to improve model p...

research · 08/09/2020 · What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
Deep learning algorithms are well-known to have a propensity for fitting...

research · 12/15/2021 · Mining Minority-class Examples With Uncertainty Estimates
In the real world, the frequency of occurrence of objects is naturally s...

research · 10/13/2022 · Benchmarking Long-tail Generalization with Likelihood Splits
In order to reliably process natural language, NLP systems must generali...

research · 06/12/2019 · Does Learning Require Memorization? A Short Tale about a Long Tail
State-of-the-art results on image recognition tasks are achieved using o...

research · 04/27/2022 · ELM: Embedding and Logit Margins for Long-Tail Learning
Long-tail learning is the problem of learning under skewed label distrib...
