Causally Correct Partial Models for Reinforcement Learning

02/07/2020
by Danilo J. Rezende, et al.

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions. However, jointly modeling future observations can be computationally expensive or even intractable if the observations are high-dimensional (e.g. images). For this reason, previous works have considered partial models, which model only part of the observation. In this paper, we show that partial models can be causally incorrect: they are confounded by the observations they don't model, and can therefore lead to incorrect planning. To address this, we introduce a general family of partial models that are provably causally correct, yet remain fast because they do not need to fully model future observations.
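The confounding described above can be illustrated with a toy calculation. The sketch below is not taken from the paper: the hidden observation `z`, the behavior policy, the reward function, and all probabilities are hypothetical values chosen for illustration. It compares what a partial model conditioned only on the action would learn from logged data, E[r | a], with the quantity a planner actually needs, E[r | do(a)].

```python
# Toy illustration (all names and numbers are assumptions, not from the paper).
# z is an observation that the behavior policy sees but the partial model does not.

P_Z = {0: 0.5, 1: 0.5}  # prior over the unmodeled observation z


def behavior_policy(a, z):
    """pi(a | z): the data-collecting agent usually picks the action matching z."""
    return 0.9 if a == z else 0.1


def reward(a, z):
    """Reward is 1 when the action matches z, otherwise 0."""
    return 1.0 if a == z else 0.0


def observational_reward(a):
    """E[r | a] in the logged data: what a model conditioned only on a would fit."""
    joint = {z: P_Z[z] * behavior_policy(a, z) for z in P_Z}  # p(z, a)
    norm = sum(joint.values())                                # p(a)
    return sum(joint[z] / norm * reward(a, z) for z in P_Z)


def interventional_reward(a):
    """E[r | do(a)]: the planner forces a, so z keeps its prior distribution."""
    return sum(P_Z[z] * reward(a, z) for z in P_Z)


for action in (0, 1):
    print(action, observational_reward(action), interventional_reward(action))
```

In this toy setup the observational estimate is about 0.9 for either action, while the interventional value is 0.5. The gap arises because, in the logged data, the chosen action carries information about `z`. A planner that reads the partial model's prediction as the effect of taking the action would therefore overestimate its return.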


