Stochastic Optimization with Bandit Sampling

08/08/2017
by Farnood Salehi, et al.

Many stochastic optimization algorithms estimate the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set. However, the resulting estimator can have a large variance, which slows down the convergence of these algorithms. One way to reduce this variance is to sample the datapoints from a carefully chosen non-uniform distribution. In this work, we propose a novel non-uniform sampling approach based on the multi-armed bandit framework. Theoretically, we show that our algorithm asymptotically approximates the optimal variance within a factor of 3. Empirically, we show that this datapoint-selection technique substantially reduces the convergence time and variance of several stochastic optimization algorithms such as SGD, SVRG, and SAGA. The approach is general and can be used in conjunction with any algorithm that relies on an unbiased gradient estimate; we expect it to have broad applicability beyond the specific examples explored in this work.
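To illustrate the idea in the abstract, below is a minimal sketch (Python/NumPy) of SGD on a least-squares problem in which the datapoint-sampling distribution is adapted online with an EXP3-style multiplicative-weights update driven by observed per-sample gradient norms, and the sampled gradient is reweighted by 1/(n p_i) so the estimate stays unbiased. The function name `bandit_sgd`, the step sizes, and the exploration-mixing parameter are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def bandit_sgd(X, y, lr=0.05, eta_p=0.01, gamma=0.5, epochs=5, seed=0):
    """Least-squares SGD with bandit-style adaptive importance sampling (sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    weights = np.ones(n)  # one bandit weight ("arm") per datapoint

    for _ in range(epochs * n):
        # Mix the learned distribution with the uniform one (EXP3-style
        # exploration) so every datapoint keeps a nonzero probability.
        p = (1 - gamma) * weights / weights.sum() + gamma / n
        i = rng.choice(n, p=p)

        # Per-sample gradient of the squared loss 0.5 * (x_i^T w - y_i)^2.
        g_i = (X[i] @ w - y[i]) * X[i]

        # Reweighting by 1 / (n * p_i) keeps the gradient estimate unbiased
        # no matter how non-uniform p becomes.
        w -= lr * g_i / (n * p[i])

        # Bandit feedback: datapoints with larger gradient norms get larger
        # weights, hence higher sampling probability, which lowers variance.
        reward = np.linalg.norm(g_i)
        weights[i] *= np.exp(eta_p * reward / p[i])
        weights /= weights.max()  # rescale to avoid numerical overflow

    return w

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 5))
    w_true = rng.normal(size=5)
    y = X @ w_true + 0.1 * rng.normal(size=200)
    print("estimated:", np.round(bandit_sgd(X, y), 2))
    print("true:     ", np.round(w_true, 2))
```

The 1/(n p_i) reweighting is what makes any sampling distribution admissible here: it preserves unbiasedness, so the bandit update only affects the variance of the gradient estimate, not its mean.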


Related research

01/13/2014 - Stochastic Optimization with Importance Sampling
Uniform sampling of training data has been commonly used in traditional ...

02/13/2018 - Online Variance Reduction for Stochastic Optimization
Modern stochastic optimization methods often rely on uniform sampling wh...

03/21/2017 - Stochastic Primal Dual Coordinate Method with Non-Uniform Sampling Based on Optimality Violations
We study primal-dual type stochastic optimization algorithms with non-un...

03/29/2019 - Online Variance Reduction with Mixtures
Adaptive importance sampling for stochastic optimization is a promising ...

06/10/2020 - Bandit Samplers for Training Graph Neural Networks
Several sampling algorithms with variance reduction have been proposed f...

10/24/2020 - Adam with Bandit Sampling for Deep Learning
Adam is a widely used optimization method for training deep learning mod...

02/18/2023 - Stochastic Approximation Approaches to Group Distributionally Robust Optimization
This paper investigates group distributionally robust optimization (GDRO...
