Does Momentum Help? A Sample Complexity Analysis
Momentum methods are widely used to accelerate stochastic iterative methods. Although a fair amount of literature is dedicated to momentum in stochastic optimisation, few results quantify the benefits of heavy ball momentum in the specific case of stochastic approximation (SA) algorithms. We first show that, under some assumptions, the convergence rate with optimal step size does not improve when momentum is used. Second, to quantify the behaviour in the initial phase, we analyse the sample complexity of the iterates with and without momentum. We show that the sample complexity bound for SA without momentum is 𝒪̃(1/(αλ_min(A))), while for SA with momentum it is 𝒪̃(1/√(αλ_min(A))), where α is the step size and λ_min(A) is the smallest eigenvalue of the driving matrix A. Although the bound for SA with momentum is better for small enough α, it turns out that for the optimal choice of α in each case, the two sample complexity bounds are of the same order.
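As a rough illustration of the two iterations being compared, the following is a minimal sketch of linear SA with an optional heavy ball term. The update form, the function name `linear_sa`, the Gaussian noise model, the test matrix, and the momentum value `eta = (1 - sqrt(alpha * lam_min))**2` (a classical heavy ball tuning) are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def linear_sa(A, b, alpha, eta=0.0, n_steps=2000, noise_std=0.1, seed=0):
    """Iterate x_{n+1} = x_n + alpha*(b - A x_n + noise) + eta*(x_n - x_{n-1}).

    eta = 0 gives plain linear SA; eta > 0 adds a heavy ball momentum term.
    This update form is an assumed, illustrative one.
    """
    rng = np.random.default_rng(seed)
    x_prev = np.zeros(len(b))
    x = np.zeros(len(b))
    for _ in range(n_steps):
        # Assumed noise model: i.i.d. zero-mean Gaussian perturbation.
        noise = noise_std * rng.standard_normal(len(b))
        x_next = x + alpha * (b - A @ x + noise) + eta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Hypothetical, ill-conditioned driving matrix with lambda_min(A) = 0.1.
A = np.diag([1.0, 0.1])
b = np.array([1.0, 1.0])
x_star = np.linalg.solve(A, b)  # the target solves A x = b

alpha = 0.01
lam_min = np.linalg.eigvalsh(A).min()
# Classical heavy ball choice, used here purely as an assumption.
eta = (1.0 - np.sqrt(alpha * lam_min)) ** 2

err_plain = np.linalg.norm(linear_sa(A, b, alpha) - x_star)
err_mom = np.linalg.norm(linear_sa(A, b, alpha, eta=eta) - x_star)
print(f"error without momentum: {err_plain:.3f}")
print(f"error with momentum:    {err_mom:.3f}")
```

Setting `noise_std=0.0` isolates the transient phase, where the contraction per step is roughly 1 - αλ_min(A) without momentum and roughly 1 - √(αλ_min(A)) with this momentum choice. With noise switched on, the comparison also depends on how each scheme amplifies the noise, which this sketch does not analyse.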