Finite-Time Error Analysis of Asynchronous Q-Learning with Discrete-Time Switching System Models

02/17/2021
by   Donghwan Lee, et al.
0

This paper develops a novel framework to analyze the convergence of Q-learning algorithm by using its connections to dynamical systems. We prove that asynchronous Q-learning with a constant step-size can be naturally formulated as discrete-time stochastic switched linear systems. Moreover, the evolution of the Q-learning estimation error is over- and underestimated by trajectories of two dynamical systems. Based on the schemes, a new finite-time analysis of the Q-learning is given with a finite-time error bound. It offers novel intuitive insights on analysis of Q-learning mainly based on control theoretic frameworks. By filling the gap between both domains in a synergistic way, this approach can potentially facilitate further progress in each field.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset

Sign in with Google

×

Use your Google Account to sign in to DeepAI

×

Consider DeepAI Pro