Dyna Planning using a Feature Based Generative Model

05/23/2018
by Ryan Faulkner, et al.

Dyna-style reinforcement learning is a powerful approach for problems where little real data is available. The main idea is to supplement real trajectories (sequences of sampled states over time) with simulated ones sampled from a learned model of the environment. However, in large state spaces, learning a good generative model of the environment has remained an open problem. We propose to use deep belief networks to learn an environment model for use in Dyna. We present our approach and validate it empirically on problems where the state observations consist of images. Our results demonstrate that using deep belief networks, which are full generative models, significantly outperforms the use of linear expectation models proposed in Sutton et al. (2008).
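
To make the Dyna idea concrete, the following is a minimal tabular Dyna-Q sketch. It is not the method proposed in this paper: the learned model here is a simple lookup table of observed transitions rather than a deep belief network, and the five-state chain environment, the function names `step` and `dyna_q`, and all hyperparameter values are assumptions chosen purely for illustration. It shows only the general pattern of interleaving each real transition with several simulated updates drawn from the learned model.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical chain environment (an assumption for this sketch).

    Action 1 moves right, action 0 moves left. Reaching the last state
    gives reward 1 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    return next_state, (1.0 if done else 0.0), done


def dyna_q(n_states=5, n_actions=2, episodes=50, planning_steps=10,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    # Tabular stand-in for the learned environment model:
    # (state, action) -> (next_state, reward, done)
    model = {}

    def update(s, a, r, s2, done):
        # One-step Q-learning backup, used for real and simulated data alike.
        target = r if done else r + gamma * max(q[s2])
        q[s][a] += alpha * (target - q[s][a])

    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(n_actions) if q[s][i] == best])

            # Real experience: act, learn from it, and record it in the model.
            s2, r, done = step(s, a, n_states)
            update(s, a, r, s2, done)
            model[(s, a)] = (s2, r, done)

            # Planning: replay transitions sampled from the learned model.
            for _ in range(planning_steps):
                ps, pa = rng.choice(list(model.keys()))
                ps2, pr, pdone = model[(ps, pa)]
                update(ps, pa, pr, ps2, pdone)

            s = s2
    return q


if __name__ == "__main__":
    for state, values in enumerate(dyna_q()):
        print(state, [round(v, 3) for v in values])
```

Setting `planning_steps=0` reduces the sketch to plain Q-learning, which is one way to see how much the simulated updates contribute when real interactions are scarce. A lookup-table model like this one does not scale to image observations, which is the gap a learned generative model is meant to fill.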

