Diversity sampling is an implicit regularization for kernel methods

by Michaël Fanuel, et al.

Kernel methods have achieved very good performance on large-scale regression and classification problems by using the Nyström method and preconditioning techniques. The Nyström approximation, based on a subset of landmarks, gives a low-rank approximation of the kernel matrix and is known to provide a form of implicit regularization. We further elaborate on the impact of sampling diverse landmarks for constructing the Nyström approximation in supervised as well as unsupervised kernel methods. By using Determinantal Point Processes (DPPs) for sampling, we obtain additional theoretical results concerning the interplay between diversity and regularization. Empirically, we demonstrate the advantages of training kernel methods on subsets of diverse points. In particular, if the dataset has a dense bulk and a sparser tail, we show that Nyström kernel regression with diverse landmarks increases the accuracy of the regression in sparser regions of the dataset, compared with uniform landmark sampling. A greedy heuristic is also proposed to select diverse samples of significant size within large datasets when exact DPP sampling is not practically feasible.
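
To make the ingredients of the abstract concrete, here is a minimal NumPy sketch of landmark selection, the Nyström approximation, and landmark-restricted kernel ridge regression. It is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian kernel, the `bandwidth`, `jitter`, and `reg` parameters are all hypothetical, and the greedy rule shown (repeatedly picking the point with the largest residual kernel diagonal, i.e. greedy determinant maximization) is a generic stand-in for DPP sampling rather than the specific heuristic proposed in the paper.

```python
import numpy as np


def gaussian_kernel(X, Y, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of X and the rows of Y."""
    sq_dists = (X**2).sum(axis=1)[:, None] + (Y**2).sum(axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def uniform_landmarks(n, m, rng):
    """Baseline: m distinct indices drawn uniformly at random."""
    return rng.choice(n, size=m, replace=False).tolist()


def greedy_diverse_landmarks(K, m):
    """Greedy determinant maximization (assumed stand-in for DPP sampling).

    Each step adds the point whose residual diagonal entry (the Schur
    complement with respect to the points already chosen) is largest,
    using a partial pivoted Cholesky factorization. Assumes m is smaller
    than the numerical rank of K.
    """
    n = K.shape[0]
    residual = np.diag(K).astype(float).copy()
    L = np.zeros((n, m))
    chosen = []
    for j in range(m):
        i = int(np.argmax(residual))
        chosen.append(i)
        # New Cholesky column for pivot i.
        L[:, j] = (K[:, i] - L[:, :j] @ L[i, :j]) / np.sqrt(residual[i])
        residual = np.maximum(residual - L[:, j] ** 2, 0.0)
        residual[chosen] = -np.inf  # never pick the same point twice
    return chosen


def nystrom_approximation(K, idx, jitter=1e-10):
    """Low-rank Nystrom approximation K[:, S] (K[S, S] + jitter I)^{-1} K[S, :]."""
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return C @ np.linalg.solve(W + jitter * np.eye(len(idx)), C.T)


def nystrom_krr_fit(K, y, idx, reg=1e-3):
    """Kernel ridge regression with the solution restricted to the landmarks.

    Minimizes ||K[:, S] a - y||^2 + n * reg * a^T K[S, S] a over a.
    """
    n = K.shape[0]
    C = K[:, idx]
    W = K[np.ix_(idx, idx)]
    return np.linalg.solve(C.T @ C + n * reg * W, C.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data with a dense bulk and a sparser tail (hypothetical example).
    X = np.vstack([rng.normal(0.0, 0.3, size=(450, 1)),
                   rng.uniform(1.0, 6.0, size=(50, 1))])
    y = np.sin(2.0 * X[:, 0])
    K = gaussian_kernel(X, X, bandwidth=0.5)
    for name, idx in [("uniform", uniform_landmarks(len(X), 15, rng)),
                      ("greedy diverse", greedy_diverse_landmarks(K, 15))]:
        alpha = nystrom_krr_fit(K, y, idx)
        pred = K[:, idx] @ alpha
        tail = X[:, 0] > 1.0
        print(name, "tail MSE:", np.mean((pred[tail] - y[tail]) ** 2))
```

Predictions at new points would use `gaussian_kernel(X_new, X[idx]) @ alpha`. The script prints the tail error for both landmark choices so the two can be compared on the toy data; it makes no claim about which one wins on any particular dataset.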




Ensemble Kernel Methods, Implicit Regularization and Determinental Point Processes

By using the framework of Determinantal Point Processes (DPPs), some the...

Towards Deterministic Diverse Subset Sampling

Determinantal point processes (DPPs) are well known models for diverse s...

Nyström landmark sampling and regularized Christoffel functions

Selecting diverse and important items from a large set is a problem of i...

On Column Selection in Approximate Kernel Canonical Correlation Analysis

We study the problem of column selection in large-scale kernel canonical...

Improving Sample and Feature Selection with Principal Covariates Regression

Selecting the most relevant features and samples out of a large set of c...

Learning the Parameters of Determinantal Point Process Kernels

Determinantal point processes (DPPs) are well-suited for modeling repuls...

Kernel Ridge Regression Using Importance Sampling with Application to Seismic Response Prediction

Scalable kernel methods, including kernel ridge regression, often rely o...
