A generalization gap estimation for overparameterized models via the Langevin functional variance

12/07/2021
by Akifumi Okuno, et al.

This paper discusses estimation of the generalization gap, the difference between the generalization error and the empirical error, for overparameterized models (e.g., neural networks). We first show that the functional variance, a key concept in defining the widely applicable information criterion (WAIC), characterizes the generalization gap even in overparameterized settings where conventional theory cannot be applied. We then propose a computationally efficient approximation of the functional variance, the Langevin approximation of the functional variance (Langevin FV). This method leverages only the first-order gradient of the squared loss function, without referencing the second-order gradient; this ensures that the computation is efficient and the implementation is compatible with gradient-based optimization algorithms. We demonstrate Langevin FV numerically by estimating the generalization gaps of overparameterized linear regression and non-linear neural network models.
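To make the idea concrete, the sketch below (not the authors' code) estimates the WAIC-style functional variance, the sum over observations of the posterior variance of each per-observation log-likelihood, for a toy overparameterized linear regression. Posterior samples are drawn with an unadjusted Langevin update that uses only the first-order gradient of the squared loss plus a ridge prior; the noise variance `sigma2`, prior precision `prior_prec`, step size, and sample counts are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a Langevin-style functional-variance estimate
# for overparameterized linear regression (p > n). Assumptions:
# Gaussian noise with known variance sigma2 and a ridge (Gaussian) prior.
import numpy as np

rng = np.random.default_rng(0)

# Toy overparameterized data: more parameters than observations.
n, p = 50, 200
X = rng.normal(size=(n, p))
theta_true = rng.normal(size=p) / np.sqrt(p)
sigma2 = 0.25                                   # assumed noise variance
y = X @ theta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

def loglik(theta):
    """Per-observation Gaussian log-likelihoods (up to additive constants)."""
    resid = y - X @ theta
    return -0.5 * resid**2 / sigma2

def grad_neg_logpost(theta, prior_prec=1.0):
    """First-order gradient of the negative log-posterior (squared loss + ridge prior)."""
    return -(X.T @ (y - X @ theta)) / sigma2 + prior_prec * theta

# Unadjusted Langevin algorithm: only first-order gradients are needed.
step_size, n_burn, n_samples = 1e-4, 2000, 2000
theta = np.zeros(p)
loglik_samples = []
for t in range(n_burn + n_samples):
    noise = rng.normal(size=p)
    theta = theta - step_size * grad_neg_logpost(theta) + np.sqrt(2 * step_size) * noise
    if t >= n_burn:
        loglik_samples.append(loglik(theta))

L = np.stack(loglik_samples)                    # shape: (n_samples, n)
# Functional variance: sum over observations of the posterior variance
# of the per-observation log-likelihood, estimated from Langevin samples.
functional_variance = L.var(axis=0, ddof=1).sum()
print(f"estimated functional variance (generalization-gap proxy): {functional_variance:.3f}")
```

The same recipe applies to a neural network by replacing the linear predictor with the network output and computing the gradient by backpropagation; no Hessian or second-order information is required.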


research
04/19/2021

Non-linear Functional Modeling using Neural Networks

We introduce a new class of non-linear models for functional data based ...
research
08/25/2020

Channel-Directed Gradients for Optimization of Convolutional Neural Networks

We introduce optimization methods for convolutional neural networks that...
research
11/21/2018

Smoothed functional average variance estimation for dimension reduction

We propose an estimation method that we call functional average variance...
research
02/22/2022

Connecting Optimization and Generalization via Gradient Flow Path Length

Optimization and generalization are two essential aspects of machine lea...
research
06/18/2019

Information matrices and generalization

This work revisits the use of information criteria to characterize the g...
research
02/05/2021

Learning While Dissipating Information: Understanding the Generalization Capability of SGLD

Understanding the generalization capability of learning algorithms is at...
research
03/16/2023

Enabling First-Order Gradient-Based Learning for Equilibrium Computation in Markets

Understanding and analyzing markets is crucial, yet analytical equilibri...
