Over-parameterization Improves Generalization in the XOR Detection Problem

10/06/2018
by Alon Brutzkus et al.

Empirical evidence suggests that neural networks with ReLU activations generalize better with over-parameterization. However, there is currently no theoretical analysis that explains this observation. In this work, we study a simplified learning task with over-parameterized convolutional networks that empirically exhibits the same qualitative phenomenon. For this setting, we provide a theoretical analysis of the optimization and generalization performance of gradient descent. Specifically, we prove data-dependent sample complexity bounds which show that over-parameterization improves the generalization performance of gradient descent.
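To make the setting concrete, below is a minimal, hypothetical sketch of an XOR-detection-style task and an over-parameterized convolutional network trained with full-batch gradient descent. The architecture details (k channels, a fixed +/-1 readout layer, hinge loss) and all hyperparameters are assumptions chosen for illustration, not the authors' exact construction; increasing k is what over-parameterizes the model in this sketch.

```python
# Hypothetical sketch (not the authors' code): an over-parameterized CNN on an
# XOR-detection-style task, trained with full-batch gradient descent.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

d, n_train = 8, 100   # patterns per example, training set size (assumed values)
k = 64                # number of channels; increase k to over-parameterize

def sample_data(n):
    # Each example is d patterns in {-1, +1}^2; label +1 if some pattern is
    # (1, 1) or (-1, -1) (an "XOR-positive" pattern), else -1.
    x = torch.randint(0, 2, (n, d, 2)).float() * 2 - 1
    prod = x[:, :, 0] * x[:, :, 1]            # +1 exactly for (1,1) or (-1,-1)
    y = torch.where((prod > 0).any(dim=1), 1.0, -1.0)
    return x, y

class XorDetectCNN(torch.nn.Module):
    # One convolutional layer of k filters applied to every 2-d pattern,
    # ReLU, max-pooling over patterns, then a fixed +/-1 readout (assumed).
    def __init__(self, k):
        super().__init__()
        self.w = torch.nn.Parameter(0.1 * torch.randn(k, 2))
        sign = torch.ones(k)
        sign[k // 2:] = -1.0                  # second layer is fixed, not trained
        self.register_buffer("sign", sign)

    def forward(self, x):                     # x: (n, d, 2)
        acts = F.relu(x @ self.w.t())         # (n, d, k)
        pooled = acts.max(dim=1).values       # max-pool over the d patterns
        return pooled @ self.sign             # scalar score per example

x_train, y_train = sample_data(n_train)
model = XorDetectCNN(k)
opt = torch.optim.SGD(model.parameters(), lr=0.05)

for step in range(2000):                      # full-batch gradient descent on the hinge loss
    loss = F.relu(1 - y_train * model(x_train)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

x_test, y_test = sample_data(10_000)
acc = (torch.sign(model(x_test)) == y_test).float().mean()
print(f"test accuracy with k={k}: {acc:.3f}")
```

Rerunning this sketch with small k (e.g. 2 or 4) versus large k is the kind of experiment that exhibits the qualitative phenomenon the paper analyzes: the over-parameterized network trained by gradient descent tends to reach better test accuracy.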

Related research

How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? (11/27/2019)
A recent line of research on deep learning focuses on the extremely over...

Towards Understanding Learning in Neural Networks with Linear Teachers (01/07/2021)
Can a neural network minimizing cross-entropy learn linearly separable d...

Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis (11/27/2019)
A remarkable recent discovery in machine learning has been that deep neu...

Demystifying the Global Convergence Puzzle of Learning Over-parameterized ReLU Nets in Very High Dimensions (06/05/2022)
This theoretical paper is devoted to developing a rigorous theory for de...

Understanding the Role of Adversarial Regularization in Supervised Learning (10/01/2020)
Despite numerous attempts sought to provide empirical evidence of advers...

On the Inductive Bias of a CNN for Orthogonal Patterns Distributions (02/22/2020)
Training overparameterized convolutional neural networks with gradient b...
