Finite-Sample Analysis of Stochastic Approximation Using Smooth Convex Envelopes

02/03/2020
by Zaiwei Chen, et al.

Stochastic Approximation (SA) is a popular approach for solving fixed-point equations when the available information is corrupted by noise. In this paper, we consider an SA algorithm involving a contraction mapping with respect to an arbitrary norm, and establish finite-sample bounds for both constant and diminishing step sizes. The idea is to construct a smooth Lyapunov function using the generalized Moreau envelope, and to show that the SA iterates contract in expectation with respect to that Lyapunov function. The result applies to various Reinforcement Learning (RL) algorithms. In particular, we use it to establish the first-known convergence rate of the V-trace algorithm for off-policy TD-Learning [15], and to improve the existing bound for the tabular Q-Learning algorithm. Further, for these two applications, our construction of the Lyapunov function results in only a logarithmic dependence of the convergence bound on the state-space dimension.
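
As a rough illustration of the setting described in the abstract, the sketch below runs an SA iteration of the form x_{k+1} = x_k + eps_k (H(x_k) - x_k + w_k), where H is a contraction in the sup norm. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy operator `contraction` (a Bellman-like map H(x)_i = gamma * max_j x_j + b_i), the i.i.d. Gaussian noise, and the particular step-size schedules. It does not construct the Moreau-envelope Lyapunov function; it only shows the iteration whose error the paper bounds.

```python
import random


def contraction(x, b, gamma):
    """Toy operator H(x)_i = gamma * max_j x_j + b_i (illustrative assumption).

    Since |max(x) - max(y)| <= ||x - y||_inf, H is a gamma-contraction
    with respect to the sup norm.
    """
    m = max(x)
    return [gamma * m + bi for bi in b]


def fixed_point(b, gamma):
    """Closed-form fixed point of the toy operator above."""
    m = max(b) / (1.0 - gamma)
    return [gamma * m + bi for bi in b]


def run_sa(b, gamma, step_size, num_iters, noise_std=0.5, seed=0):
    """Run x_{k+1} = x_k + eps_k * (H(x_k) - x_k + w_k) from x_0 = 0.

    step_size is a function k -> eps_k; w_k is i.i.d. Gaussian noise
    (an assumed noise model, chosen only for simplicity).
    """
    rng = random.Random(seed)
    x = [0.0] * len(b)
    for k in range(num_iters):
        eps = step_size(k)
        hx = contraction(x, b, gamma)
        x = [xi + eps * (hi - xi + rng.gauss(0.0, noise_std))
             for xi, hi in zip(x, hx)]
    return x


def sup_norm_error(x, x_star):
    """Distance to the fixed point in the sup norm."""
    return max(abs(a - c) for a, c in zip(x, x_star))


if __name__ == "__main__":
    b, gamma = [1.0, 0.5, -0.2], 0.5
    x_star = fixed_point(b, gamma)
    # Constant step size: fast initial progress, then a noise floor.
    x_const = run_sa(b, gamma, lambda k: 0.1, 5000)
    # Diminishing step size: slower, but the error keeps shrinking.
    x_dimin = run_sa(b, gamma, lambda k: 2.0 / (k + 2), 5000)
    print("constant step error:   ", sup_norm_error(x_const, x_star))
    print("diminishing step error:", sup_norm_error(x_dimin, x_star))
```

The two schedules are meant to mirror the constant versus diminishing step-size regimes mentioned in the abstract; the specific constants (0.1 and 2/(k+2)) are arbitrary choices for this toy problem.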


research
02/02/2021

A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants

This paper develops a unified framework to study finite-sample converge...
research
06/24/2021

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

In temporal difference (TD) learning, off-policy sampling is known to be...
research
07/23/2012

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces

We address the problem of automatic generation of features for value fun...
research
09/09/2023

Finite-sample analysis of rotation operator under l_2 norm and l_∞ norm

In this article, we consider a special operator called the two-dimension...
research
08/05/2022

Sample Complexity of Policy-Based Methods under Off-Policy Sampling and Linear Function Approximation

In this work, we study policy-based methods for solving the reinforcemen...
research
07/30/2020

Momentum Q-learning with Finite-Sample Convergence Guarantee

Existing studies indicate that momentum ideas in conventional optimizati...
research
11/01/2019

Generalized Speedy Q-learning

In this paper, we derive a generalization of the Speedy Q-learning (SQL)...
