An Impossibility Result for High Dimensional Supervised Learning

01/29/2013
by Mohammad Hossein Rohban, et al.

We study high-dimensional asymptotic performance limits of binary supervised classification problems in which the class-conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability under complete distributional knowledge and equally likely classes, can be arbitrarily close to zero while the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies in which the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension-based minimax lower bounds, which consider the worst-case error probability over all distributions with a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can be attained only for parametric subsets of measure zero (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless strong structural constraints, such as sparsity, are imposed on the parametric space, supervised learning may be ineffective in high-dimensional, small-sample settings.
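The phenomenon is straightforward to reproduce numerically. The sketch below is a hypothetical illustration, not the paper's construction: it fixes the Bayes error of two identity-covariance Gaussian classes by holding the mean separation constant while the dimension d grows, trains a plug-in nearest-centroid rule on a small labeled sample, and evaluates that rule's exact error under the true model (the function names `phi` and `plug_in_error` are ours).

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def phi(z):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def plug_in_error(d, n_train=20, delta=4.0):
    # Classes N(+mu/2, I) and N(-mu/2, I) with ||mu|| = delta held
    # fixed, so the Bayes error phi(-delta/2) is constant in d.
    mu = np.full(d, delta / math.sqrt(d))
    Xp = rng.normal(mu / 2, 1.0, size=(n_train, d))   # labeled class +1
    Xm = rng.normal(-mu / 2, 1.0, size=(n_train, d))  # labeled class -1
    w = Xp.mean(axis=0) - Xm.mean(axis=0)        # estimated mean difference
    b = (Xp.mean(axis=0) + Xm.mean(axis=0)) / 2  # estimated midpoint
    s = np.linalg.norm(w)
    # Exact error of the linear rule sign(w.(x - b)) under the true model.
    err_plus = phi((w @ b - w @ mu / 2) / s)
    err_minus = phi(-(w @ b + w @ mu / 2) / s)
    return (err_plus + err_minus) / 2

print("Bayes error:", round(phi(-2.0), 4))  # constant in d
for d in (10, 1000, 100000):
    print(d, round(plug_in_error(d), 3))
```

With 20 labeled samples per class the learned rule is near the Bayes error at small d, but its error drifts toward the 0.5 of a random coin toss as d grows, even though the Bayes error never moves from roughly 0.023.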



Related research

06/11/2023 · Efficient Learning of Minimax Risk Classifiers in High Dimensions
High-dimensional data is common in multiple areas, such as health care a...

06/07/2016 · A Minimax Approach to Supervised Learning
Given a task of predicting Y from X, a loss function L, and a set of pro...

05/08/2016 · Rate-Distortion Bounds on Bayes Risk in Supervised Learning
We present an information-theoretic framework for bounding the number of...

12/14/2018 · Asymptotically Minimax Predictive Density for Sparse Count Data
Predictive density estimation under the Kullback--Leibler loss in high-d...

11/13/2018 · Fundamental Limits of Exact Support Recovery in High Dimensions
We study the support recovery problem for a high-dimensional signal obse...

06/13/2018 · Benchmarks for Image Classification and Other High-dimensional Pattern Recognition Problems
A good classification method should yield more accurate results than sim...

10/23/2019 · Unifying Variational Inference and PAC-Bayes for Supervised Learning that Scales
Neural Network based controllers hold enormous potential to learn comple...
