Simultaneously Updating All Persistence Values in Reinforcement Learning

11/21/2022
by Luca Sabbioni, et al.

In reinforcement learning, the performance of learning agents is highly sensitive to the choice of time discretization. Agents acting at high frequencies have the best control opportunities, but they also face drawbacks, such as potentially inefficient exploration and vanishing action advantages. Repeating actions, i.e., action persistence, helps here, as it allows the agent to visit wider regions of the state space and improves the estimation of the action effects. In this work, we derive a novel All-Persistence Bellman Operator, which makes effective use of both low-persistence experience, by decomposition into sub-transitions, and high-persistence experience, thanks to a suitable bootstrap procedure. In this way, we use transitions collected at any time scale to simultaneously update the action values for every persistence in the considered set. We prove the contraction property of the All-Persistence Bellman Operator and, based on it, we extend classic Q-learning and DQN. After studying the effects of persistence, we experimentally evaluate our approach in both tabular settings and more challenging frameworks, including some Atari games.
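The idea of updating all persistence values from a single stretch of experience can be illustrated with a small tabular sketch. The abstract does not give the operator's exact form, so what follows is only one plausible reading, not the authors' algorithm. It assumes a table `Q[(state, action, k)]` holding the value of repeating `action` for `k` steps, a persistence set `{1, ..., K_MAX}`, and a bootstrap on the remaining `k - d` repetitions when a sub-transition of length `d` is shorter than `k`. All names (`GAMMA`, `ALPHA`, `K_MAX`, `sub_transitions`, `update_all_persistences`) are hypothetical.

```python
from collections import defaultdict

GAMMA = 0.9   # discount factor (assumed value)
ALPHA = 0.5   # learning rate (assumed value)
K_MAX = 3     # hypothetical persistence set {1, 2, 3}


def sub_transitions(states, rewards):
    """Enumerate every contiguous sub-segment of a persisted-action segment.

    `states` holds s_0, ..., s_n visited while one action is repeated n times,
    and `rewards` holds the n per-step rewards.
    Yields (start_state, end_state, length, discounted_return).
    """
    n = len(rewards)
    for i in range(n):
        ret = 0.0
        for d in range(1, n - i + 1):
            ret += (GAMMA ** (d - 1)) * rewards[i + d - 1]
            yield states[i], states[i + d], d, ret


def update_all_persistences(Q, states, action, rewards, actions):
    """Update Q[(s, action, k)] for every k in the persistence set.

    Assumed targets (illustrative only):
      k == d: the sub-transition covers the whole persistence, so bootstrap
              greedily over all (action, persistence) pairs at the end state.
      k >  d: the sub-transition is too short, so bootstrap on the value of
              persisting the same action for the remaining k - d steps.
      k <  d: skipped, because the length-k prefix is enumerated separately.
    """
    ks = range(1, K_MAX + 1)
    for s, s_next, d, ret in sub_transitions(states, rewards):
        for k in ks:
            if k == d:
                boot = max(Q[(s_next, a, j)] for a in actions for j in ks)
            elif k > d:
                boot = Q[(s_next, action, k - d)]
            else:
                continue
            target = ret + (GAMMA ** d) * boot
            Q[(s, action, k)] += ALPHA * (target - Q[(s, action, k)])
    return Q
```

Calling `update_all_persistences(defaultdict(float), [0, 1, 2], 0, [1.0, 1.0], [0, 1])` on a segment where action `0` was repeated twice touches the entries for persistences 1, 2, and 3 at once, which is the "simultaneous update" the abstract refers to.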

research
02/17/2020

Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning

The choice of the control frequency of a system has a relevant impact on...
research
01/02/2021

Notes on pivot pairings

We present a row reduction algorithm to compute the barcode decompositio...
research
12/04/2021

Updating Zigzag Persistence and Maintaining Representatives over Changing Filtrations

Computing persistence over changing filtrations give rise to a stack of ...
research
04/23/2018

State Distribution-aware Sampling for Deep Q-learning

A critical and challenging problem in reinforcement learning is how to l...
research
03/06/2013

Possibilistic decreasing persistence

A key issue in the handling of temporal data is the treatment of persist...
research
01/12/2023

Persistence-Based Discretization for Learning Discrete Event Systems from Time Series

To get a good understanding of a dynamical system, it is convenient to h...
research
06/03/2022

A Learning-Based Method for Automatic Operator Selection in the Fanoos XAI System

We describe an extension of the Fanoos XAI system [Bayani et al 2022] wh...