Optimisation & Generalisation in Networks of Neurons

10/18/2022
by Jeremy Bernstein, et al.

The goal of this thesis is to develop the optimisation and generalisation theoretic foundations of learning in artificial neural networks. On optimisation, a new theoretical framework is proposed for deriving architecture-dependent first-order optimisation algorithms. The approach works by combining a "functional majorisation" of the loss function with "architectural perturbation bounds" that encode an explicit dependence on neural architecture. The framework yields optimisation methods that transfer hyperparameters across learning problems. On generalisation, a new correspondence is proposed between ensembles of networks and individual networks. It is argued that, as network width and normalised margin are taken large, the space of networks that interpolate a particular training set concentrates on an aggregated Bayesian method known as a "Bayes point machine". This correspondence provides a route for transferring PAC-Bayesian generalisation theorems over to individual networks. More broadly, the correspondence presents a fresh perspective on the role of regularisation in networks with vastly more parameters than data.
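The "functional majorisation" mentioned above follows the general majorise-minimise principle: upper-bound the loss by a surrogate that touches it at the current iterate, then minimise the surrogate. As a minimal sketch (not the thesis's actual architecture-dependent bound), assume the loss is L-smooth, so it admits the quadratic majorant f(w') ≤ f(w) + ∇f(w)·(w'−w) + (L/2)‖w'−w‖², whose exact minimiser is the gradient step w' = w − ∇f(w)/L. The toy quadratic loss below is purely illustrative.

```python
import numpy as np

def loss(w):
    # Toy quadratic loss (true smoothness constant is 1).
    return 0.5 * np.sum(w ** 2)

def grad(w):
    return w

def mm_step(w, L):
    """Minimise the quadratic majorant of the loss around w.

    Any L at or above the true smoothness constant gives a valid
    upper bound, so the step can never increase the loss.
    """
    return w - grad(w) / L

w = np.array([3.0, -4.0])
for _ in range(10):
    w_next = mm_step(w, L=2.0)
    # The majorant touches f at w, so minimising it cannot increase f.
    assert loss(w_next) <= loss(w)
    w = w_next
```

Because each step minimises a bound that is tight at the current point, monotone descent is guaranteed without a line search; the thesis's contribution is to derive such majorants whose constants depend explicitly on the neural architecture.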

research
04/14/2023

Wasserstein PAC-Bayes Learning: A Bridge Between Generalisation and Optimisation

PAC-Bayes learning is an established framework to assess the generalisat...
research
03/31/2021

Quantum Optimization for Training Quantum Neural Networks

Training quantum neural networks (QNNs) using gradient-based or gradient...
research
09/19/2018

Bayesian functional optimisation with shape prior

Real world experiments are expensive, and thus it is important to reach ...
research
06/12/2017

Practical Gauss-Newton Optimisation for Deep Learning

We present an efficient block-diagonal approximation to the Gauss-Newt...
research
03/23/2021

PAC-Bayesian theory for stochastic LTI systems

In this paper we derive a PAC-Bayesian error bound for autonomous stocha...
research
05/06/2019

Fast and Reliable Architecture Selection for Convolutional Neural Networks

The performance of a Convolutional Neural Network (CNN) depends on its h...
research
04/21/2021

Automatic model training under restrictive time constraints

We develop a hyperparameter optimisation algorithm, Automated Budget Con...
