Accelerating Goal-Directed Reinforcement Learning by Model Characterization

01/04/2019
by Shoubhik Debnath, et al.

We propose a hybrid approach for improving sample efficiency in goal-directed reinforcement learning. The approach has two steps: first, we approximate a model from model-free reinforcement learning; then, we use this approximate model, together with a notion of reachability based on Mean First Passage Times, to perform model-based reinforcement learning. Building on this observation, we design two new algorithms, Mean First Passage Time based Q-Learning (MFPT-Q) and Mean First Passage Time based DYNA (MFPT-DYNA), which are fundamentally modified versions of state-of-the-art reinforcement learning techniques. Preliminary results show that our hybrid approaches converge in far fewer iterations than their state-of-the-art counterparts, and therefore require far fewer samples and training trials.
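The abstract does not say how the approximate model or the Mean First Passage Times are computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a small discrete state space, estimates a transition matrix by counting observed transitions, and then obtains the expected number of steps to reach a goal state by solving the standard linear system for first passage times in a Markov chain. The function names `estimate_transition_matrix` and `mean_first_passage_times`, the uniform fallback for unvisited states, and the three-state example are all hypothetical.

```python
import numpy as np


def estimate_transition_matrix(transitions, n_states):
    """Estimate P[s, s'] from observed (state, next_state) pairs by counting.

    Assumption: states never observed as a source get a uniform row, so the
    result is still a valid stochastic matrix.
    """
    counts = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    uniform = np.full((n_states, n_states), 1.0 / n_states)
    # Normalise visited rows; np.maximum avoids dividing by zero in unvisited rows.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1), uniform)


def mean_first_passage_times(P, goal):
    """Expected number of steps to first reach `goal` from every state.

    Solves m_i = 1 + sum_{j != goal} P[i, j] * m_j for all i != goal,
    with m_goal = 0. Assumes the goal is reachable from every other state;
    otherwise the linear system is singular.
    """
    n = P.shape[0]
    others = [s for s in range(n) if s != goal]
    Q = P[np.ix_(others, others)]  # transitions among non-goal states
    m = np.linalg.solve(np.eye(len(others)) - Q, np.ones(len(others)))
    mfpt = np.zeros(n)
    mfpt[others] = m
    return mfpt


# Hypothetical three-state chain 0 -> 1 -> 2, with state 2 as the goal.
samples = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)]
P_hat = estimate_transition_matrix(samples, n_states=3)
mfpt = mean_first_passage_times(P_hat, goal=2)
print(mfpt)  # approximately 4, 2 and 0 steps from states 0, 1 and 2
```

In this sketch a smaller first passage time means a state is closer to the goal in expectation. How such a reachability measure is actually folded into the Q-Learning or DYNA updates is described in the full paper, not in this abstract.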

research
05/30/2018

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Model-based reinforcement learning (RL) algorithms can attain excellent ...
research
06/16/2020

Model Embedding Model-Based Reinforcement Learning

Model-based reinforcement learning (MBRL) has shown its advantages in sa...
research
03/23/2020

Do recent advancements in model-based deep reinforcement learning really improve data efficiency?

Reinforcement learning (RL) has seen great advancements in the past few ...
research
08/30/2022

Model-Based Reinforcement Learning with SINDy

We draw on the latest advancements in the physics community to propose a...
research
06/16/2018

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Model-free Reinforcement Learning (RL) offers an attractive approach to ...
research
12/17/2022

Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation

In recent years several learning approaches to point goal navigation in ...
research
01/31/2023

Learning, Fast and Slow: A Goal-Directed Memory-Based Approach for Dynamic Environments

Model-based next state prediction and state value prediction are slow to...
