Model-based Offline Reinforcement Learning with Count-based Conservatism

07/21/2023
by Byeongchan Kim, et al.

In this paper, we propose a model-based offline reinforcement learning method that integrates count-based conservatism, named $\texttt{Count-MORL}$. Our method uses count estimates of state-action pairs to quantify model estimation error; to the best of our knowledge, this is the first algorithm to demonstrate the efficacy of count-based conservatism in model-based offline deep RL. For the proposed method, we first show that the estimation error is inversely proportional to the frequency of state-action pairs. Second, we demonstrate that the policy learned under the count-based conservative model offers near-optimal performance guarantees. Through extensive numerical experiments, we validate that $\texttt{Count-MORL}$ with a hash-code implementation significantly outperforms existing offline RL algorithms on the D4RL benchmark datasets. The code is accessible at $\href{https://github.com/oh-lab/Count-MORL}{https://github.com/oh-lab/Count-MORL}$.
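To make the idea concrete, below is a minimal, hypothetical sketch of count-based conservatism with hash-code counting. It is not the authors' implementation (see the repository above for that); the names (`SimHashCounter`, `penalized_reward`), the SimHash-style random-projection discretization, the penalty scale `beta`, and the $1/\sqrt{n(s,a)}$ penalty form are all illustrative assumptions about how counts of state-action pairs could translate into a conservative reward.

```python
import numpy as np

# Hypothetical sketch only: hash-based counts of (state, action) pairs and a
# count-based conservative reward penalty. Not the official Count-MORL code.

class SimHashCounter:
    """Counts visits to (state, action) pairs via random-projection (SimHash-style) codes."""

    def __init__(self, input_dim, n_bits=32, seed=0):
        rng = np.random.default_rng(seed)
        self.projection = rng.normal(size=(n_bits, input_dim))  # random hyperplanes
        self.counts = {}

    def _code(self, state, action):
        x = np.concatenate([state, action])
        bits = (self.projection @ x > 0).astype(np.uint8)  # sign pattern -> binary code
        return bits.tobytes()                               # hashable key

    def update(self, state, action):
        key = self._code(state, action)
        self.counts[key] = self.counts.get(key, 0) + 1

    def count(self, state, action):
        return self.counts.get(self._code(state, action), 0)


def penalized_reward(r, counter, state, action, beta=1.0):
    """Conservative reward: subtract a penalty that shrinks as the pair's count grows,
    mirroring an estimation error inversely proportional to visitation frequency."""
    n = max(counter.count(state, action), 1)
    return r - beta / np.sqrt(n)


if __name__ == "__main__":
    counter = SimHashCounter(input_dim=5)        # e.g. 3-dim state + 2-dim action
    s, a = np.ones(3), np.zeros(2)
    for _ in range(10):
        counter.update(s, a)
    print(penalized_reward(1.0, counter, s, a))  # penalty decays with repeated visits
```

In such a scheme, rarely visited state-action pairs receive a large penalty and frequently visited ones are trusted, which is the intuition behind training the policy on a count-based conservative model.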

Related research

02/22/2023 · Behavior Proximal Policy Optimization
09/16/2023 · DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning
02/16/2021 · COMBO: Conservative Offline Model-Based Policy Optimization
05/21/2021 · On Instrumental Variable Regression for Deep Offline Policy Evaluation
07/01/2021 · Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
07/12/2022 · Offline Equilibrium Finding
05/04/2023 · Masked Trajectory Models for Prediction, Representation, and Control