Learning Bounds for Greedy Approximation with Explicit Feature Maps from Multiple Kernels

10/09/2018
by Shahin Shahrampour, et al.

Nonlinear kernels can be approximated using finite-dimensional feature maps for efficient risk minimization. Due to the inherent trade-off between the dimension of the (mapped) feature space and the approximation accuracy, the key problem is to identify promising (explicit) features that lead to satisfactory out-of-sample performance. In this work, we tackle this problem by efficiently choosing such features from multiple kernels in a greedy fashion. Our method sequentially selects these explicit features from a set of candidate features using a correlation metric. We establish an out-of-sample error bound capturing the trade-off between the error in terms of explicit features (approximation error) and the error due to spectral properties of the best model in the Hilbert space associated with the combined kernel (spectral error). The result verifies that when the (best) underlying data model is sparse enough, i.e., when the spectral error is negligible, the test error can be controlled with a small number of explicit features, which can scale poly-logarithmically with the number of data points. Our empirical results show that, given a fixed number of explicit features, the method achieves a lower test error at a smaller time cost than the state-of-the-art in data-dependent random features.
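The abstract only sketches the selection rule, so a minimal illustration may help. The snippet below is one plausible reading, not the paper's implementation: a pool of single random Fourier features drawn from several Gaussian kernels (one bandwidth per kernel) serves as the candidate set, and features are selected greedily by the magnitude of their correlation with the current residual, with a least-squares refit after each pick (an orthogonal-matching-pursuit-style rule). The candidate construction, bandwidth choices, and refitting step are assumptions made here for illustration.

```python
import numpy as np

def candidate_features(X, n_candidates, bandwidths, rng):
    """Build a candidate pool of single random Fourier features drawn
    from several Gaussian kernels (one bandwidth per kernel).
    Illustrative construction; the paper's candidate set may differ."""
    n, d = X.shape
    feats = np.empty((n, n_candidates))
    for j in range(n_candidates):
        sigma = bandwidths[j % len(bandwidths)]       # cycle through the kernels
        w = rng.normal(scale=1.0 / sigma, size=d)     # frequency ~ N(0, sigma^-2 I)
        b = rng.uniform(0.0, 2.0 * np.pi)             # random phase
        feats[:, j] = np.sqrt(2.0) * np.cos(X @ w + b)
    return feats

def greedy_select(feats, y, k):
    """Pick k candidate features sequentially, each time choosing the one
    most correlated with the current residual, then refit least squares."""
    chosen = []
    residual = y.astype(float).copy()
    coef = np.zeros(0)
    for _ in range(k):
        scores = np.abs(feats.T @ residual)           # correlation with residual
        scores[chosen] = -np.inf                      # never pick a feature twice
        chosen.append(int(np.argmax(scores)))
        Z = feats[:, chosen]
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)  # refit on selected features
        residual = y - Z @ coef
    return chosen, coef

# Toy usage: recover a sparse nonlinear signal with 20 of 500 candidates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
feats = candidate_features(X, n_candidates=500, bandwidths=[0.5, 1.0, 2.0], rng=rng)
idx, coef = greedy_select(feats, y, k=20)
```

Each greedy step costs a single matrix-vector product over the candidate pool, which is consistent with the abstract's claim of a smaller time cost than dense data-dependent random-feature models at a fixed feature budget.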
