A Bayesian Perspective on Training Speed and Model Selection

10/27/2020
by Clare Lyle, et al.

We take a Bayesian perspective to illustrate a connection between training speed and the marginal likelihood in linear models. This provides two major insights. First, a measure of a model's training speed can be used to estimate its marginal likelihood. Second, under certain conditions, this measure predicts the relative weighting of models in linear model combinations trained to minimize a regression loss. We verify our results on model selection tasks for linear models and for the infinite-width limit of deep neural networks. We further provide encouraging empirical evidence that the intuition developed in these settings also holds for deep neural networks trained with stochastic gradient descent. Our results suggest a promising new direction towards explaining why neural networks trained with stochastic gradient descent are biased towards functions that generalize well.
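The link between training speed and marginal likelihood rests on the chain rule of probability: the log marginal likelihood of a dataset decomposes into a sum of one-step-ahead log posterior predictives, so a model that fits each new point quickly (given the points before it) accumulates a higher marginal likelihood. The sketch below illustrates this identity for a conjugate Bayesian linear regression model; it is a minimal illustration, not the paper's estimator, and all variable names (`alpha`, `sigma2`, etc.) are assumptions of the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data from a linear model: y = X @ w_true + noise.
n, d = 20, 3
sigma2 = 0.25        # observation noise variance (assumed known)
alpha = 1.0          # prior variance of the weights: w ~ N(0, alpha * I)
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + np.sqrt(sigma2) * rng.normal(size=n)

# Exact log marginal likelihood: y ~ N(0, alpha * X X^T + sigma2 * I).
K = alpha * X @ X.T + sigma2 * np.eye(n)
_, logdet = np.linalg.slogdet(K)
exact = -0.5 * (y @ np.linalg.solve(K, y) + logdet + n * np.log(2 * np.pi))

# "Training speed" view: process the data one point at a time,
# accumulating the one-step-ahead log posterior predictive of each
# point and then updating the posterior over the weights w.
P = np.eye(d) / alpha    # posterior precision over w
b = np.zeros(d)          # precision-weighted posterior mean
sequential = 0.0
for x_i, y_i in zip(X, y):
    S = np.linalg.inv(P)             # current posterior covariance
    m = S @ b                        # current posterior mean
    pred_mean = x_i @ m
    pred_var = x_i @ S @ x_i + sigma2
    sequential += -0.5 * (np.log(2 * np.pi * pred_var)
                          + (y_i - pred_mean) ** 2 / pred_var)
    # Conjugate Gaussian update with the new observation.
    P += np.outer(x_i, x_i) / sigma2
    b += x_i * y_i / sigma2

# The two quantities agree: summed predictive log-likelihoods during
# sequential "training" recover the marginal likelihood exactly.
print(abs(exact - sequential))
```

In this conjugate setting the equality is exact; the paper's contribution is showing how analogous sums of losses accumulated during (gradient-based) training estimate or bound the marginal likelihood in linear models and infinite-width networks.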


Related research

- A subsampling approach for Bayesian model selection (01/31/2022): It is common practice to use Laplace approximations to compute marginal ...
- Neural Tangents: Fast and Easy Infinite Neural Networks in Python (12/05/2019): Neural Tangents is a library designed to enable research into infinite-w...
- The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent (04/15/2019): The goal of this paper is to study why stochastic gradient descent (SGD)...
- Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent (02/18/2019): A longstanding goal in deep learning research has been to precisely char...
- A stochastic optimization approach to train non-linear neural networks with a higher-order variation regularization (08/04/2023): While highly expressive parametric models including deep neural networks...
- The Gaussian equivalence of generative models for learning with two-layer neural networks (06/25/2020): Understanding the impact of data structure on learning in neural network...
- Parsimonious Bayesian deep networks (05/22/2018): Combining Bayesian nonparametrics and a forward model selection strategy...
