Closed-Form Analytical Results for Maximum Entropy Reinforcement Learning

06/07/2021
by Argenis Arriojas, et al.

We introduce a mapping between Maximum Entropy Reinforcement Learning (MaxEnt RL) and Markovian processes conditioned on rare events. In the long-time limit, this mapping allows us to derive analytical expressions for the optimal policy, dynamics, and initial-state distributions for the general case of stochastic dynamics in MaxEnt RL. We find that soft-𝒬 functions in MaxEnt RL can be obtained from the Perron-Frobenius eigenvalue and the corresponding left eigenvector of a regular, non-negative matrix derived from the underlying Markov Decision Process (MDP). These results lead to novel algorithms for model-based and model-free MaxEnt RL, which we validate by numerical simulations. The mapping established in this work opens further avenues for applying novel analytical and computational approaches to problems in MaxEnt RL. We make our code available at: https://github.com/argearriojas/maxent-rl-mdp-scripts
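The abstract refers to the Perron-Frobenius eigenvalue and left eigenvector of a regular, non-negative matrix. As a rough illustration of that one ingredient only, the sketch below runs a plain power iteration on a small hand-picked matrix. The function name `perron_left_eigenpair` and the toy matrix are assumptions made for this example. The sketch does not build the matrix from an MDP, it does not reproduce the paper's mapping to soft-𝒬 functions, and it is not taken from the authors' repository.

```python
def perron_left_eigenpair(matrix, tol=1e-12, max_iter=10_000):
    """Power iteration for the dominant eigenvalue and left eigenvector.

    `matrix` is a square list of rows with non-negative entries, assumed
    regular (some power has all entries positive), so the dominant
    eigenvalue is real, positive, and simple.
    Returns (rho, u) with u normalized to sum to 1.
    """
    n = len(matrix)
    u = [1.0 / n] * n          # start from the uniform vector
    rho = 0.0
    for _ in range(max_iter):
        # Left multiplication: (u M)_j = sum_i u_i * M_ij
        v = [sum(u[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        # Because u sums to 1, sum(u M) approaches rho as u converges.
        new_rho = sum(v)
        v = [x / new_rho for x in v]
        converged = max(abs(a - b) for a, b in zip(u, v)) < tol
        u, rho = v, new_rho
        if converged:
            break
    return rho, u


# Hypothetical toy matrix (not derived from any MDP in the paper).
toy = [[1.0, 1.0],
       [2.0, 0.0]]
rho, u = perron_left_eigenpair(toy)
```

For this toy matrix the eigenvalues are 2 and -1, so the iteration should approach ρ = 2 with left eigenvector (2/3, 1/3). Power iteration is used here for transparency; for large state-action spaces a sparse eigensolver would likely be the more practical choice.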
