Variance-Reduced Methods for Machine Learning

10/02/2020
by Robert M. Gower, et al.

Stochastic optimization lies at the heart of machine learning, and its cornerstone is stochastic gradient descent (SGD), a method introduced over 60 years ago. The last 8 years have seen an exciting new development: variance reduction (VR) for stochastic optimization methods. These VR methods excel in settings where more than one pass through the training data is allowed, achieving faster convergence than SGD in both theory and practice. These speedups underlie the surge of interest in VR methods and the fast-growing body of work on this topic. This review covers the key principles and main developments behind VR methods for optimization with finite data sets and is aimed at non-expert readers. We focus mainly on the convex setting and provide pointers for readers interested in extensions to minimizing non-convex functions.
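To make the idea concrete, here is a minimal sketch of one well-known VR method, SVRG (stochastic variance-reduced gradient), applied to a toy finite-sum least-squares problem. All names, hyperparameters, and the problem instance are illustrative choices, not taken from the review itself: an outer loop periodically computes the full gradient at a snapshot point, and the inner loop uses it to correct each stochastic gradient, so the variance of the update vanishes as the iterates approach the optimum.

```python
import numpy as np

# Illustrative finite-sum problem:  min_w (1/n) * sum_i 0.5*(a_i^T w - b_i)^2
rng = np.random.default_rng(0)
n, d = 200, 5
A = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
b = A @ w_star                      # consistent system, so w_star is an optimum

def grad_i(w, i):
    """Gradient of the i-th summand."""
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w):
    """Full (batch) gradient, averaged over all n terms."""
    return A.T @ (A @ w - b) / n

def svrg(w0, lr=0.01, outer=50, inner=2 * n):
    w = w0.copy()
    for _ in range(outer):
        w_snap = w.copy()
        mu = full_grad(w_snap)       # full gradient at the snapshot
        for _ in range(inner):
            i = rng.integers(n)
            # Variance-reduced, unbiased gradient estimate:
            # E[g] = full_grad(w), but Var[g] -> 0 as w, w_snap -> w_star
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
    return w

w = svrg(np.zeros(d))
print(np.linalg.norm(w - w_star))   # error shrinks linearly with the outer loop
```

Unlike plain SGD, whose constant step size leaves a noise floor proportional to the gradient variance, this corrected estimate lets SVRG use a constant step size and still converge linearly on strongly convex finite sums, which is the speedup the abstract refers to.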

