Optimal Rate of Kernel Regression in Large Dimensions

09/08/2023
by Weihao Lu, et al.

We study kernel regression for large-dimensional data, where the sample size n depends polynomially on the dimension d of the samples, i.e., n ≍ d^γ for some γ > 0. We first build a general tool that characterizes the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity ε_n^2 and the metric entropy ε̅_n^2, respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on 𝕊^d, we use the new tool to show that the minimax rate of the excess risk of kernel regression is n^{-1/2} when n ≍ d^γ for γ = 2, 4, 6, 8, ⋯. We then determine the optimal rate of the excess risk of kernel regression for all γ > 0 and find that the curve of the optimal rate, viewed as a function of γ, exhibits several new phenomena, including multiple descent behavior and periodic plateau behavior. As an application, we provide a similar explicit description of the curve of the optimal rate for the neural tangent kernel (NTK). As a direct corollary, these claims hold for wide neural networks as well.
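To make the setting concrete, the following is a minimal sketch of the regime the abstract describes: inputs drawn uniformly from the sphere 𝕊^d, a sample size n ≈ d^γ, and an inner product kernel. Everything specific here is an assumption for illustration and is not taken from the paper: the kernel exp(⟨x, y⟩), the use of kernel ridge regression as the estimator, the target function, the noise level, and the names `sample_sphere`, `inner_product_kernel`, `krr_fit_predict`, and `excess_risk`.

```python
import numpy as np


def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere S^d embedded in R^(d+1)."""
    x = rng.standard_normal((n, d + 1))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def inner_product_kernel(a, b):
    """Illustrative inner product kernel k(x, y) = exp(<x, y>) (an assumption)."""
    return np.exp(a @ b.T)


def krr_fit_predict(x_train, y_train, x_test, lam):
    """Kernel ridge regression: solve (K + n*lam*I) alpha = y, then predict."""
    n = x_train.shape[0]
    gram = inner_product_kernel(x_train, x_train)
    alpha = np.linalg.solve(gram + n * lam * np.eye(n), y_train)
    return inner_product_kernel(x_test, x_train) @ alpha


def excess_risk(d, gamma, lam, rng, n_test=500, noise=0.1):
    """Monte Carlo estimate of the excess risk when n is about d**gamma."""
    n = int(round(d ** gamma))
    x_train = sample_sphere(n, d, rng)
    x_test = sample_sphere(n_test, d, rng)

    # Hypothetical target: a low-degree polynomial of the coordinates.
    def target(x):
        return x[:, 0] + x[:, 0] * x[:, 1]

    y_train = target(x_train) + noise * rng.standard_normal(n)
    pred = krr_fit_predict(x_train, y_train, x_test, lam)
    # Excess risk = mean squared distance to the noiseless target.
    return float(np.mean((pred - target(x_test)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for gamma in (1.0, 1.5, 2.0):
        print(gamma, excess_risk(d=10, gamma=gamma, lam=1e-3, rng=rng))
```

A simulation like this only illustrates the scaling n ≍ d^γ at one small d. It does not reproduce the minimax rates stated in the abstract, which are asymptotic statements over a function class rather than a property of one target and one regularization level.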


research
03/04/2023

Minimax optimal high-dimensional classification using deep neural networks

High-dimensional classification is a fundamentally important research pr...
research
08/27/2019

On the Risk of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels

We study the risk of minimum-norm interpolants of data in a Reproducing ...
research
04/09/2021

How rotational invariance of common kernels prevents generalization in high dimensions

Kernel ridge regression is well-known to achieve minimax optimal rates i...
research
04/13/2021

Gradient Kernel Regression

In this article a surprising result is demonstrated using the neural tan...
research
05/07/2023

Sliced Inverse Regression with Large Structural Dimensions

The central space of a joint distribution (X,Y) is the minimal subspace 𝒮...
research
03/04/2011

Multiple Kernel Learning: A Unifying Probabilistic Viewpoint

We present a probabilistic viewpoint to multiple kernel learning unifyin...
research
02/12/2023

Generalization Ability of Wide Neural Networks on ℝ

We perform a study on the generalization ability of the wide two-layer R...
