A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs

05/05/2021
by Devansh Bisla, et al.

This paper focuses on understanding how the generalization error of deep neural networks (DNNs) scales with the amount of training data. Existing techniques in statistical learning require computing capacity measures, such as the VC dimension, to provably bound this error. It is unclear, however, how to extend these measures to DNNs, so the existing analyses apply only to simple neural networks that are rarely used in practice, e.g., linear or shallow networks and multi-layer perceptrons. Moreover, many theoretical error bounds are not empirically verifiable. We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures. Our approach hinges on two major assumptions: i) the network achieves zero training error; ii) the probability of making an error on a test point is proportional to the distance between this point and its nearest training point in the feature space, and it saturates at a certain maximal distance (which we call the radius). Based on these assumptions we estimate the generalization error of DNNs. The resulting estimate scales as O(1/(δN^(1/d))), where N is the size of the training set, and it is parameterized by two quantities: the effective dimensionality of the data as perceived by the network (d) and the aforementioned radius (δ), both of which we determine empirically. We show that our estimates match the experimentally observed behavior of the error on multiple learning tasks using benchmark data sets and realistic models. Estimating training data requirements is essential for deploying safety-critical applications such as autonomous driving. Furthermore, collecting and annotating training data requires substantial financial, computational, and human resources; our empirical estimates help allocate these resources efficiently.
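To make the quantities in the abstract concrete, here is a minimal Python sketch of how the two assumptions and the O(1/(δN^(1/d))) scaling could be evaluated from feature-space embeddings. It is an illustration under stated assumptions, not the authors' exact procedure: the function names, the two-NN dimensionality estimator, the saturation model min(dist/δ, 1), and the constant C are all placeholders chosen for this sketch.

```python
# Minimal sketch (not the authors' exact procedure) illustrating the two
# assumptions and the O(1/(delta * N^(1/d))) scaling estimate.
# `train_feats` / `test_feats` are assumed to be feature-space embeddings
# extracted from a trained network, with shape [num_points, feat_dim].
import numpy as np
from sklearn.neighbors import NearestNeighbors


def nearest_train_distances(train_feats, test_feats):
    """Distance from each test point to its nearest training point."""
    nn = NearestNeighbors(n_neighbors=1).fit(train_feats)
    dists, _ = nn.kneighbors(test_feats)
    return dists[:, 0]


def error_probability(dist, delta):
    """Assumption ii): error probability grows with feature-space distance
    and saturates at the radius delta (saturation level 1 is an assumption)."""
    return np.minimum(dist / delta, 1.0)


def estimate_intrinsic_dim(feats):
    """Two-NN estimate of the effective dimensionality d (one possible
    empirical choice; the paper may use a different estimator)."""
    dists, _ = NearestNeighbors(n_neighbors=3).fit(feats).kneighbors(feats)
    valid = dists[:, 1] > 0                   # drop duplicate points
    mu = dists[valid, 2] / dists[valid, 1]    # ratio of 2nd to 1st neighbor
    return valid.sum() / np.sum(np.log(mu))


def generalization_error_estimate(N, d, delta, C=1.0):
    """Scaling law C / (delta * N^(1/d)); the constant C is unknown here."""
    return C / (delta * N ** (1.0 / d))
```

As a usage note, the scaling makes the data-requirement argument tangible: with d ≈ 10 and δ fixed, doubling the training set from N to 2N shrinks the estimate only by a factor of 2^(1/10) ≈ 1.07, which is why data sets with high effective dimensionality demand disproportionately more training samples.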


