Robust Entropy-regularized Markov Decision Processes

12/31/2021
by   Tien Mai, et al.

Stochastic and soft optimal policies resulting from entropy-regularized Markov decision processes (ER-MDP) are desirable for exploration and imitation learning applications. Motivated by the fact that such policies are sensitive to the state transition probabilities, and that estimates of these probabilities may be inaccurate, we study a robust version of the ER-MDP model, in which the stochastic optimal policies are required to be robust with respect to the ambiguity in the underlying transition probabilities. Our work is at the crossroads of two important schemes in reinforcement learning (RL), namely, robust MDP and entropy-regularized MDP. We show that essential properties that hold for the non-robust ER-MDP and robust unregularized MDP models also hold in our setting, making the robust ER-MDP problem tractable. We show how our framework and results can be integrated into different algorithmic schemes, including value iteration and (modified) policy iteration, which would lead to new robust RL and inverse RL algorithms that handle uncertainties. Analyses of computational complexity and error propagation under conventional uncertainty settings are also provided.
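
As a rough illustration of how entropy regularization and robustness could combine in value iteration, the sketch below applies a soft (log-sum-exp) Bellman backup in which each state-action pair is evaluated against the worst case over a small finite set of candidate transition distributions. This is not taken from the paper: the finite, per-state-action uncertainty set, the temperature `tau`, and the names `soft_max` and `robust_soft_value_iteration` are all assumptions made for this example.

```python
import math


def soft_max(values, tau):
    """Numerically stable tau * log(sum(exp(v / tau)))."""
    m = max(values)
    return m + tau * math.log(sum(math.exp((v - m) / tau) for v in values))


def robust_soft_value_iteration(rewards, candidates, gamma=0.9, tau=1.0,
                                tol=1e-8, max_iter=10_000):
    """Hypothetical robust soft value iteration (illustrative only).

    rewards[s][a]    : reward for action a in state s
    candidates[s][a] : list of candidate next-state distributions
                       (assumed finite uncertainty set for the pair (s, a))
    Returns the value function V and the softmax policy pi[s][a].
    """
    n_states = len(rewards)
    V = [0.0] * n_states
    Q = [[0.0] * len(rewards[s]) for s in range(n_states)]
    for _ in range(max_iter):
        # Worst-case expected next value over the candidate distributions.
        Q = [
            [
                rewards[s][a] + gamma * min(
                    sum(p[t] * V[t] for t in range(n_states))
                    for p in candidates[s][a]
                )
                for a in range(len(rewards[s]))
            ]
            for s in range(n_states)
        ]
        # Entropy-regularized backup: log-sum-exp instead of a hard max.
        V_new = [soft_max(Q[s], tau) for s in range(n_states)]
        delta = max(abs(x - y) for x, y in zip(V, V_new))
        V = V_new
        if delta < tol:
            break
    # Softmax policy implied by Q and V; each row sums to one.
    pi = [[math.exp((q - V[s]) / tau) for q in Q[s]] for s in range(n_states)]
    return V, pi


# Toy two-state, two-action problem (made-up numbers).
rewards = [[1.0, 0.0], [0.0, 2.0]]
nominal = [[[[0.9, 0.1]], [[0.1, 0.9]]],
           [[[0.9, 0.1]], [[0.1, 0.9]]]]
# Same model, with one extra candidate distribution per state-action pair.
ambiguous = [[dists + [[0.5, 0.5]] for dists in row] for row in nominal]

V_nominal, _ = robust_soft_value_iteration(rewards, nominal)
V_robust, pi_robust = robust_soft_value_iteration(rewards, ambiguous)
```

With these inputs, the robust values should never exceed the nominal ones, since enlarging the set of candidate distributions can only lower the worst-case backup. Other uncertainty sets (for example, norm balls around an estimated distribution) would replace the inner `min` with a small optimization problem.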


research
08/18/2020

A Relation Analysis of Markov Decision Process Frameworks

We study the relation between different Markov Decision Process (MDP) fr...
research
06/24/2023

Decision-Dependent Distributionally Robust Markov Decision Process Method in Dynamic Epidemic Control

In this paper, we present a Distributionally Robust Markov Decision Proc...
research
09/28/2022

Online Policy Optimization for Robust MDP

Reinforcement learning (RL) has exceeded human performance in many synth...
research
09/13/2019

Reinforcement Learning: a Comparison of UCB Versus Alternative Adaptive Policies

In this paper we consider the basic version of Reinforcement Learning (R...
research
06/17/2020

Parameterized MDPs and Reinforcement Learning Problems – A Maximum Entropy Principle Based Framework

We present a framework to address a class of sequential decision making ...
research
06/07/2021

Closed-Form Analytical Results for Maximum Entropy Reinforcement Learning

We introduce a mapping between Maximum Entropy Reinforcement Learning (M...
research
10/21/2022

Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables

One key challenge for multi-task Reinforcement learning (RL) in practice...
