Layer-wise Learning of Kernel Dependence Networks

06/15/2020
by Chieh Wu et al.

We propose a greedy strategy to train a deep network for multi-class classification, where each layer is defined as a composition of a linear projection and a nonlinear mapping. This nonlinear mapping is the feature map of a Gaussian kernel, and the linear projection is learned by maximizing the dependence between the layer output and the labels, using the Hilbert-Schmidt Independence Criterion (HSIC) as the dependence measure. Since each layer is trained greedily in sequence, all learning is local, and neither backpropagation nor even gradient descent is needed. The depth and width of the network are determined via natural guidelines, and the procedure regularizes the weights of the linear layer. As the key theoretical result, the function class represented by the network is proved to be sufficiently rich to learn any dataset labeling using a finite number of layers, in the sense of reaching minimum mean-squared error or cross-entropy, as long as no two data points with different labels coincide. Experiments demonstrate good generalization performance of the greedy approach across multiple benchmarks, while showing a significant computational advantage over a multilayer perceptron of the same complexity trained globally by backpropagation.
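The layer-wise idea can be sketched in a few lines of NumPy. The sketch below is an illustrative stand-in, not the paper's exact algorithm: each layer learns its linear projection in closed form, as the top eigenvectors of the HSIC objective matrix for a linear kernel on the projected output (so no gradient descent is used), and then applies a random-Fourier-feature approximation of the Gaussian kernel map. The function names, the RFF approximation, and all hyperparameters (`out_dim`, `n_features`, `sigma`, number of layers) are assumptions for illustration.

```python
import numpy as np

def hsic(K, L):
    """Empirical (biased) HSIC between two kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def train_layer(X, Ky, out_dim):
    """Closed-form linear projection maximizing HSIC between X @ W and the
    labels, with a linear kernel on the projection (illustrative stand-in
    for the paper's learned projection)."""
    n = X.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    M = X.T @ H @ Ky @ H @ X                     # symmetric HSIC objective
    _, vecs = np.linalg.eigh(M)                  # eigenvalues in ascending order
    return vecs[:, -out_dim:]                    # top eigenvectors as W

def gaussian_rff(Z, n_features=32, sigma=1.0, seed=0):
    """Random Fourier feature approximation of the Gaussian kernel map."""
    rng = np.random.default_rng(seed)
    Wr = rng.normal(0.0, 1.0 / sigma, size=(Z.shape[1], n_features))
    b = rng.uniform(0.0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(Z @ Wr + b)

# Toy two-class data: two Gaussian blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.3, (30, 5)), rng.normal(1, 0.3, (30, 5))])
y = np.array([0] * 30 + [1] * 30)
Y = np.eye(2)[y]
Ky = Y @ Y.T                                     # label kernel

# Greedy layer-wise training: each layer is (linear projection, Gaussian map),
# trained locally in sequence with no backpropagation.
Z = X
for layer in range(3):
    W = train_layer(Z, Ky, out_dim=4)
    Z = gaussian_rff(Z @ W, n_features=32, seed=layer)
    print(f"layer {layer}: HSIC(Z, Y) = {hsic(Z @ Z.T, Ky):.4f}")
```

On this toy problem, each printed value tracks how strongly the current layer's output depends on the labels; the greedy procedure adds layers until this dependence is high enough for a simple readout to separate the classes.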


