On Acceleration of Gradient-Based Empirical Risk Minimization using Local Polynomial Regression

04/16/2022
by   Ekaterina Trimbach, et al.

We study the acceleration of the Local Polynomial Interpolation-based Gradient Descent method (LPI-GD) recently proposed for the approximate solution of empirical risk minimization (ERM) problems. We focus on loss functions that are strongly convex and smooth with condition number σ. We additionally assume the loss function is η-Hölder continuous with respect to the data. The oracle complexity of LPI-GD is Õ(σ m^d log(1/ε)) for a desired accuracy ε, where d is the dimension of the parameter space and m is the cardinality of an approximation grid. The factor m^d can be shown to scale as O((1/ε)^(d/(2η))). LPI-GD has been shown to have better oracle complexity than gradient descent (GD) and stochastic gradient descent (SGD) for certain parameter regimes. We propose two accelerated methods for the ERM problem based on LPI-GD and show an oracle complexity of Õ(√σ m^d log(1/ε)). Moreover, we provide the first empirical study of local polynomial interpolation-based gradient methods and corroborate that LPI-GD outperforms GD and SGD in some scenarios, while the proposed methods achieve acceleration.
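The √σ dependence in the improved bound is the signature of momentum-based acceleration. As a rough illustration only (the abstract does not specify the paper's two accelerated methods), the sketch below applies a standard Nesterov momentum scheme to a generic approximate-gradient oracle; the names approx_grad, L, and mu are assumptions standing in for the local-polynomial-interpolated gradient and the smoothness/strong-convexity constants.

    import numpy as np

    def accelerated_gd(approx_grad, x0, L, mu, num_iters):
        """Nesterov-style accelerated descent driven by an approximate
        gradient oracle (e.g. a locally interpolated gradient).
        Illustrative sketch, not the paper's specific algorithms."""
        kappa = L / mu                                        # condition number (σ in the abstract)
        beta = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)    # momentum coefficient
        x_prev = x0.copy()
        y = x0.copy()
        for _ in range(num_iters):
            x = y - approx_grad(y) / L        # gradient step from the extrapolated point
            y = x + beta * (x - x_prev)       # momentum / extrapolation step
            x_prev = x
        return x_prev

Under this reading, an accelerated scheme needs on the order of √σ log(1/ε) iterations, and if each call to the oracle costs on the order of m^d gradient evaluations on the approximation grid, the total work is consistent with the Õ(√σ m^d log(1/ε)) bound stated above.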


Related research

11/04/2020 - Gradient-Based Empirical Risk Minimization using Local Polynomial Regression
06/16/2020 - Federated Accelerated Stochastic Gradient Descent
04/08/2022 - Decision-Dependent Risk Minimization in Geometrically Decaying Dynamic Environments
01/09/2020 - How to trap a gradient flow
01/17/2020 - Gradient descent with momentum — to accelerate or to super-accelerate?
03/08/2018 - Fast Convergence for Stochastic and Distributed Gradient Descent in the Interpolation Limit
06/24/2016 - Stochastic Multiple Choice Learning for Training Diverse Deep Ensembles
