Stacked Kernel Network

by Shuai Zhang, et al.

Kernel methods are powerful tools for capturing nonlinear patterns in data. They implicitly learn high- (even infinite-) dimensional nonlinear features in a Reproducing Kernel Hilbert Space (RKHS) while keeping computation tractable via the kernel trick. Classic kernel methods learn a single layer of nonlinear features, whose representational power may be limited. Motivated by the recent success of deep neural networks (DNNs), which learn multi-layer hierarchical representations, we propose the Stacked Kernel Network (SKN), which learns a hierarchy of RKHS-based nonlinear features. SKN interleaves several layers of nonlinear transformations (from a linear space into an RKHS) and linear transformations (from an RKHS back to a linear space). Like a DNN, an SKN is composed of multiple layers of hidden units, but each unit is parameterized by an RKHS function rather than a finite-dimensional vector. We propose three ways to represent the RKHS functions in SKN: (1) a nonparametric representation, (2) a parametric representation, and (3) a random Fourier feature representation. Furthermore, we extend SKN to a convolutional architecture, the Stacked Kernel Convolutional Network (SKCN), which is suited to image inputs. SKCN learns a hierarchy of RKHS-based nonlinear features through convolution, with each filter parameterized by an RKHS function rather than a finite-dimensional matrix as in a standard CNN. Experiments on various datasets demonstrate the effectiveness of SKN and SKCN, both of which outperform competitive baselines.
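To illustrate the interleaving of nonlinear and linear transformations described above, here is a minimal sketch of one SKN-style layer using the random Fourier feature representation (the third option in the abstract). The layer sizes, bandwidth `sigma`, and the random (untrained) linear map are illustrative assumptions, not the authors' implementation; in the actual SKN these parameters would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)

class RFFKernelLayer:
    """One SKN-style layer (illustrative sketch):
    a nonlinear map into an approximate RKHS via random Fourier
    features for the RBF kernel, followed by a linear map back
    to a finite-dimensional space."""

    def __init__(self, in_dim, num_features, out_dim, sigma=1.0):
        # Random Fourier features approximating the RBF kernel
        # k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))
        self.W = rng.normal(0.0, 1.0 / sigma, size=(in_dim, num_features))
        self.b = rng.uniform(0.0, 2 * np.pi, size=num_features)
        # Linear map from the RKHS feature space back to out_dim.
        # Random here for illustration; in SKN it would be trained.
        self.A = rng.normal(0.0, 0.1, size=(num_features, out_dim))

    def forward(self, X):
        # Nonlinear transformation: linear space -> (approximate) RKHS
        phi = np.sqrt(2.0 / self.W.shape[1]) * np.cos(X @ self.W + self.b)
        # Linear transformation: RKHS -> linear space
        return phi @ self.A

# Stack two layers, mimicking SKN's hierarchy of RKHS-based features.
layers = [RFFKernelLayer(4, 64, 8), RFFKernelLayer(8, 64, 2)]
X = rng.normal(size=(5, 4))
for layer in layers:
    X = layer.forward(X)
print(X.shape)  # (5, 2)
```

Stacking more such layers deepens the hierarchy; the nonparametric and parametric representations mentioned in the abstract would replace the random Fourier map with an exact kernel expansion over data points or learned anchor points, respectively.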



