Mise en abyme with artificial intelligence: how to predict the accuracy of NN, applied to hyper-parameter tuning

06/28/2019
by Giorgia Franchini, et al.

In the context of deep learning, the costliest phase from a computational point of view is the full training of the learning algorithm. However, this process must be repeated a significant number of times during the design of a new artificial neural network, leading to extremely expensive operations. Here, we propose a low-cost strategy to predict the accuracy of the algorithm based only on its initial behaviour. To do so, we train the network of interest up to convergence several times, modifying its characteristics at each training. The initial and final accuracies observed during this preliminary process are stored in a database. We then use both curve-fitting and Support Vector Machine techniques, the latter trained on the created database, to predict the accuracy of the network from its accuracy on the first iterations of its learning. This approach is of particular interest when the space of the network's characteristics is notably large or when its full training is highly time-consuming. The results we obtained are promising and encouraged us to apply this strategy to a topical issue: hyper-parameter optimisation (HO). In particular, we focus on the HO of a convolutional neural network for the classification of the MNIST and CIFAR-10 databases. By using our prediction method, together with an algorithm we implemented for a probabilistic exploration of the hyper-parameter space, we were able to find, at a fairly low cost, the hyper-parameter settings corresponding to the optimal accuracies already known in the literature.
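To make the strategy concrete, the sketch below illustrates the two prediction routes mentioned above: extrapolating the early learning curve by curve fitting, and training a Support Vector Machine (here a support vector regressor) on a database pairing early accuracies with final accuracies. This is only a sketch under stated assumptions: the saturating-exponential model, the synthetic placeholder data and all function names are illustrative, and scikit-learn/SciPy stand in for whatever tooling the authors actually used.

# A minimal sketch of the prediction strategy, assuming a SciPy/scikit-learn
# environment. The saturating-exponential model, the synthetic placeholder
# data and every function name below are illustrative assumptions, not the
# authors' implementation.
import numpy as np
from scipy.optimize import curve_fit
from sklearn.svm import SVR

def learning_curve(t, a, b, c):
    # Assumed model: accuracy rises towards a plateau 'a' as training proceeds.
    return a - b * np.exp(-c * t)

def extrapolate_final_accuracy(epochs, early_acc, final_epoch):
    # Curve fitting: fit the first iterations and extrapolate to the last epoch.
    params, _ = curve_fit(learning_curve, epochs, early_acc,
                          p0=[early_acc[-1], 0.5, 0.1], maxfev=5000)
    return learning_curve(final_epoch, *params)

# Database built beforehand: for each full training run we store the accuracies
# observed on the first epochs (features) and the final accuracy (target).
rng = np.random.default_rng(0)
epochs = np.arange(1, 6)
plateaus = rng.uniform(0.6, 0.95, size=50)                     # 50 stored runs
early_curves = plateaus[:, None] * (1 - np.exp(-0.5 * epochs)) \
               + rng.normal(0.0, 0.01, size=(50, 5))
final_accuracies = plateaus + rng.normal(0.0, 0.01, size=50)

# Support Vector Machine route: a support vector regressor trained on the
# database maps the initial behaviour of a new configuration to its predicted
# final accuracy.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01)
svr.fit(early_curves, final_accuracies)

# New configuration: only its first five epochs are observed.
new_early = 0.9 * (1 - np.exp(-0.5 * epochs))
print("SVR prediction      :", svr.predict(new_early.reshape(1, -1))[0])
print("Curve-fit prediction:", extrapolate_final_accuracy(epochs, new_early, 100))

In the paper's setting, the regressor would be trained once on the stored runs and then queried for each new hyper-parameter configuration, so that only the first few epochs of each candidate network need to be computed.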
