Stochastic Approximation Beyond Gradient for Signal Processing and Machine Learning

by Aymeric Dieuleveut, et al.

Stochastic approximation (SA) is a classical algorithm that has had a major impact on signal processing since its early days, and more recently on machine learning, because both fields must handle large amounts of data observed with uncertainty. A prominent special case of SA is the popular stochastic (sub)gradient algorithm, the workhorse behind many important applications. A lesser-known fact is that the SA scheme also extends to non-stochastic-gradient algorithms such as compressed stochastic gradient, stochastic expectation-maximization, and a number of reinforcement learning algorithms. This article gives an overview of the non-stochastic-gradient perspectives of SA for the signal processing and machine learning audiences by presenting theory-backed design guidelines for SA algorithms. Our central theme is a general framework that unifies existing theories of SA, including its non-asymptotic and asymptotic convergence results, and demonstrates their application to popular non-stochastic-gradient algorithms. We build our analysis framework on classes of Lyapunov functions that satisfy a variety of mild conditions. We draw connections between non-stochastic-gradient algorithms and scenarios where the Lyapunov function is smooth, convex, or strongly convex. Using this framework, we illustrate the convergence properties of the non-stochastic-gradient algorithms with concrete examples. Extensions to emerging variance reduction techniques for improved sample complexity are also discussed.
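
To make the idea of a non-gradient SA scheme concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the article's own algorithm or notation. It assumes the standard Robbins-Monro form of the update, theta_{k+1} = theta_k + gamma_{k+1} * H(theta_k, X_{k+1}), with a noisy linear mean field h(theta) = b - A theta. The matrix `A`, the vector `b`, the noise level, the step-size schedule, and all function names are hypothetical choices made for this example. Because `A` is non-symmetric, h is not the gradient of any function, yet V(theta) = 0.5 * ||theta - theta_star||^2 decreases along the mean field (the symmetric part of `A` is the identity), so V can play the role of a strongly convex Lyapunov function.

```python
import random

def stochastic_approximation(oracle, theta0, num_iters, step_size, seed=0):
    """Generic SA recursion: theta <- theta + gamma_k * H(theta, X_k)."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(1, num_iters + 1):
        gamma = step_size(k)          # diminishing step size gamma_k
        h = oracle(theta, rng)        # noisy evaluation of the mean field
        theta = [t + gamma * hi for t, hi in zip(theta, h)]
    return theta

# Hypothetical problem data: A is non-symmetric, so b - A @ theta is not a
# gradient field, but the symmetric part of A is the identity (positive definite).
A = [[1.0, 2.0], [-2.0, 1.0]]
b = [1.0, 1.0]

def linear_oracle(theta, rng, noise_std=0.1):
    """Return b - A @ theta plus independent Gaussian noise on each coordinate."""
    return [
        b[i] - sum(A[i][j] * theta[j] for j in range(2)) + rng.gauss(0.0, noise_std)
        for i in range(2)
    ]

theta_hat = stochastic_approximation(
    linear_oracle,
    theta0=[0.0, 0.0],
    num_iters=20000,
    step_size=lambda k: 1.0 / (k + 10),
)

# Root of the mean field, i.e. the solution of A @ theta = b.
theta_star = [-0.2, 0.6]
print(theta_hat, theta_star)
```

The iterate should settle near `theta_star`, the root of the mean field, even though no objective function is being minimized. Swapping `linear_oracle` for a stochastic gradient oracle would recover the stochastic gradient special case within the same loop.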

