Finite-time analysis of single-timescale actor-critic

10/18/2022
by   Xuyang Chen, et al.
0

Despite the great empirical success of actor-critic methods, its finite-time convergence is still poorly understood in its most practical form. In particular, the analysis of single-timescale actor-critic presents significant challenges due to the highly inaccurate critic estimation and the complex error propagation dynamics over iterations. Existing works on analyzing single-timescale actor-critic only focus on the i.i.d. sampling or tabular setting for simplicity, which is rarely the case in practical applications. We consider the more practical online single-timescale actor-critic algorithm on continuous state space, where the critic is updated with a single Markovian sample per actor step. We prove that the online single-timescale actor-critic method is guaranteed to find an ϵ-approximate stationary point with 𝒪(ϵ^-2) sample complexity under standard assumptions, which can be further improved to 𝒪(ϵ^-2) under i.i.d. sampling. Our analysis develops a novel framework that evaluates and controls the error propagation between actor and critic in a systematic way. To our knowledge, this is the first finite-time analysis for online single-timescale actor-critic method. Overall, our results compare favorably to the existing literature on analyzing actor-critic in terms of considering the most practical settings and requiring weaker assumptions.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset