Unsupervised Learning of Initialization in Deep Neural Networks via Maximum Mean Discrepancy

02/08/2023
by Cheolhyoung Lee, et al.

Despite the recent success of stochastic gradient descent in deep learning, it is often difficult to train a deep neural network when its initial parameters are poorly chosen. Even when training succeeds, the initial parameter configuration is known to potentially harm generalization. In this paper, we propose an unsupervised algorithm that finds a good initialization for given input data, assuming that the downstream task is d-way classification. We first observe that each parameter configuration in the parameter space corresponds to one particular d-way classification task. We then conjecture that the success of learning is directly related to how diverse the downstream tasks are in the vicinity of the initial parameters. We thus design an algorithm that encourages small perturbations to the initial parameter configuration to lead to a diverse set of d-way classification tasks. In other words, the proposed algorithm ensures that a solution to any downstream task lies near the initial parameter configuration. We empirically evaluate the proposed algorithm on various tasks derived from MNIST with a fully connected network. In these experiments, we observe that our algorithm improves average test accuracy across most of these tasks, and that the improvement is greater when the number of labelled examples is small.
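
The abstract does not say how maximum mean discrepancy (MMD) enters the objective, so the following is background only and not the authors' algorithm. It is a minimal sketch of the standard biased estimator of squared MMD between two sample sets, which measures how far apart two empirical distributions are. The Gaussian kernel, the `bandwidth` parameter, and the function names are assumptions chosen for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys.

    xs and ys are non-empty lists of vectors (lists of floats).
    The kernel choice is an assumption, not taken from the paper.
    """
    def mean_kernel(a, b):
        # Average kernel value over all pairs (u, v) with u in a and v in b.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)
```

Calling `mmd_squared(xs, ys)` on two lists of vectors returns a non-negative number that is zero when the two sample sets are identical and grows as the samples separate. The pairwise loop costs O(n·m) kernel evaluations, which is fine for a small illustration but would need vectorizing for realistic sample sizes.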

research
10/04/2018

Unsupervised Learning via Meta-Learning

A central goal of unsupervised learning is to acquire representations fr...
research
09/26/2022

Learning to Learn with Generative Models of Neural Network Checkpoints

We explore a data-driven approach for learning to optimize neural networ...
research
11/25/2021

Predicting the success of Gradient Descent for a particular Dataset-Architecture-Initialization (DAI)

Despite their massive success, training successful deep neural networks ...
research
12/13/2017

Evolving Unsupervised Deep Neural Networks for Learning Meaningful Representations

Deep Learning (DL) aims at learning the meaningful representations. A me...
research
06/08/2019

Simultaneous Classification and Novelty Detection Using Deep Neural Networks

Deep neural networks have achieved great success in classification tasks...
research
02/25/2022

An initial alignment between neural network and target is needed for gradient descent to learn

This paper introduces the notion of "Initial Alignment" (INAL) between a...
research
07/06/2018

The Goldilocks zone: Towards better understanding of neural network loss landscapes

We explore the loss landscape of fully-connected neural networks using r...
