Understanding Decision-Time vs. Background Planning in Model-Based Reinforcement Learning

by Safa Alver, et al.
McGill University

In model-based reinforcement learning, an agent can leverage a learned model to improve its behavior in several ways. Two prevalent approaches are decision-time planning and background planning. In this study, we seek to understand under what conditions and in which settings one of these two planning styles will outperform the other in domains that require fast responses. After viewing them through the lens of dynamic programming, we first consider the classical instantiations of these planning styles and provide theoretical results and hypotheses on which one will perform better in the pure planning, planning & learning, and transfer learning settings. We then consider the modern instantiations of these planning styles and provide hypotheses on which one will perform better in the latter two settings. Lastly, we perform several illustrative experiments to empirically validate both our theoretical results and our hypotheses. Overall, our findings suggest that even though decision-time planning does not perform as well as background planning in their classical instantiations, in their modern instantiations it can perform on par with or better than background planning in both the planning & learning and transfer learning settings.
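To make the distinction between the two planning styles concrete, the following is a minimal illustrative sketch and not the authors' algorithms or experimental setup. It assumes a hypothetical five-state chain environment with an exact, deterministic tabular model; the names `model`, `background_planning`, `act_background`, and `act_decision_time` are invented for this example. Background planning uses the model to improve a stored value table ahead of time, so acting is a lookup. Decision-time planning stores nothing and instead runs a depth-limited lookahead from the current state each time an action is needed.

```python
# Hypothetical 5-state chain (an assumption for illustration only):
# states 0..4, actions 0 (left) and 1 (right); reaching state 4 yields reward 1.
N_STATES, ACTIONS, GAMMA, GOAL = 5, (0, 1), 0.9, 4


def model(state, action):
    """Assumed exact deterministic model: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def background_planning(sweeps=50):
    """Improve a stored Q-table with model-generated backups before acting."""
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(sweeps):
        for s in range(GOAL):  # sweep over non-terminal states only
            for a in ACTIONS:
                s2, r, done = model(s, a)
                q[s][a] = r + (0.0 if done else GAMMA * max(q[s2]))
    return q


def act_background(q, state):
    """Acting is a cheap table lookup; all planning cost was paid up front."""
    return max(ACTIONS, key=lambda a: q[state][a])


def act_decision_time(state, depth=5):
    """Plan only from the current state, at the moment an action is needed."""

    def lookahead(s, a, d):
        # Depth-limited return of taking action a in s, then acting greedily.
        s2, r, done = model(s, a)
        if done or d == 1:
            return r
        return r + GAMMA * max(lookahead(s2, b, d - 1) for b in ACTIONS)

    return max(ACTIONS, key=lambda a: lookahead(state, a, depth))


if __name__ == "__main__":
    q = background_planning()
    print([act_background(q, s) for s in range(GOAL)])
    print([act_decision_time(s) for s in range(GOAL)])
```

In this toy setting both styles select the same actions. The difference lies in where the computation happens: the lookahead in `act_decision_time` grows with the search depth at every step, which is the kind of cost that matters in domains requiring fast responses.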


Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning

Learning models of the environment from pure interaction is often consid...

Discriminator Augmented Model-Based Reinforcement Learning

By planning through a learned dynamics model, model-based reinforcement ...

Think Too Fast Nor Too Slow: The Computational Trade-off Between Planning And Reinforcement Learning

Planning and reinforcement learning are two key approaches to sequential...

Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies

State-of-the-art efficient model-based Reinforcement Learning (RL) algor...

On the role of planning in model-based deep reinforcement learning

Model-based planning is often thought to be necessary for deep, careful ...

Multi-Advisor Reinforcement Learning

We consider tackling a single-agent RL problem by distributing it to n l...

Goal-Space Planning with Subgoal Models

This paper investigates a new approach to model-based reinforcement lear...
