Preconditioning Kernel Matrices

02/22/2016
by Kurt Cutajar, et al.

The computational and storage complexity of kernel machines is the primary barrier to scaling them to large, modern datasets. A common way to tackle the scalability issue is the conjugate gradient algorithm, which relaxes the constraints on both storage (the kernel matrix need not be stored) and computation (both stochastic gradients and parallelization can be used). Even so, conjugate gradient has its own issues: kernel matrices are often so ill-conditioned that it converges poorly in practice. Preconditioning is a common approach to alleviating this issue. Here we propose preconditioned conjugate gradients for kernel machines and develop a broad range of preconditioners that are particularly useful for kernel matrices. We describe a scalable approach to both solving kernel machines and learning their hyperparameters. We show that this approach is exact in the limit of iterations and outperforms state-of-the-art approximations for a given computational budget.
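To make the idea concrete, the following is a minimal sketch of preconditioned conjugate gradients applied to a regularized kernel system (K + noise·I)x = y. It is an illustration under stated assumptions, not the authors' implementation: the RBF kernel, the Nyström-style low-rank preconditioner, and all names (`rbf_kernel`, `nystrom_preconditioner`, `pcg`, `noise`, `lengthscale`, `inducing_idx`) are choices made here for the example, and the abstract does not say which preconditioners the paper develops. The kernel matrix is formed densely only for brevity; in a scalable setting the matrix-vector product would be computed without storing it.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared Euclidean distances, clipped at zero to guard against round-off.
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def nystrom_preconditioner(X, inducing_idx, noise, lengthscale=1.0, jitter=1e-6):
    # Assumed preconditioner: P = K_nm K_mm^{-1} K_mn + noise * I.
    # Returns a function applying P^{-1} via the Woodbury identity.
    Z = X[inducing_idx]
    K_nm = rbf_kernel(X, Z, lengthscale)
    K_mm = rbf_kernel(Z, Z, lengthscale) + jitter * np.eye(len(Z))
    L = np.linalg.cholesky(K_mm)
    B = np.linalg.solve(L, K_nm.T).T          # B @ B.T = K_nm K_mm^{-1} K_mn
    inner = noise * np.eye(B.shape[1]) + B.T @ B

    def apply(r):
        return (r - B @ np.linalg.solve(inner, B.T @ r)) / noise

    return apply

def pcg(matvec, b, precond=None, tol=1e-10, max_iter=2000):
    # Standard preconditioned conjugate gradients for a symmetric
    # positive definite system; returns the solution and iteration count.
    x = np.zeros_like(b)
    r = b.copy()
    z = precond(r) if precond is not None else r.copy()
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for k in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            return x, k
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = precond(r) if precond is not None else r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

# Small synthetic problem (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
noise = 1e-2

A = rbf_kernel(X, X) + noise * np.eye(len(X))
matvec = lambda v: A @ v

inducing_idx = rng.choice(len(X), size=50, replace=False)
precond = nystrom_preconditioner(X, inducing_idx, noise)

x_plain, iters_plain = pcg(matvec, y)
x_pre, iters_pre = pcg(matvec, y, precond=precond)
print("CG iterations without / with preconditioner:", iters_plain, iters_pre)
```

Comparing `iters_plain` with `iters_pre` gives a rough feel for how much a preconditioner can help on an ill-conditioned kernel system; the actual gain depends on the kernel, the noise level, and the number of inducing points.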
