Exploration in Approximate Hyper-State Space for Meta Reinforcement Learning

by Luisa Zintgraf, et al.

Meta-learning is a powerful tool for learning policies that can adapt efficiently when deployed in new tasks. If, however, the meta-training tasks have sparse rewards, the need for exploration during meta-training is exacerbated: the agent must explore and learn across many tasks. We show that current meta-learning methods can fail catastrophically in such environments. To address this problem, we propose HyperX, a novel method for meta-learning in sparse reward tasks. Using novel reward bonuses for meta-training, we incentivise the agent to explore in approximate hyper-state space, i.e., the joint space of states and approximate beliefs, where the beliefs are over tasks. We show empirically that these bonuses allow an agent to successfully learn to solve sparse reward tasks where existing meta-learning methods fail.
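To make the idea of an exploration bonus over hyper-states concrete, here is a minimal sketch. The paper uses learned novelty bonuses; as a simplified, hypothetical stand-in, the sketch below uses a count-based bonus over a discretised hyper-state (the environment state concatenated with an approximate belief over tasks). The function name, discretisation scheme, and bonus scale are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def hyper_state_bonus(counts, state, belief, bins=10, scale=1.0):
    """Count-based novelty bonus over a discretised hyper-state.

    The hyper-state is the concatenation of the environment state and an
    approximate (low-dimensional) belief over tasks. Rarely visited
    hyper-states receive a larger bonus, encouraging exploration jointly
    in state space and belief space. This is a simplified stand-in for
    the learned bonuses used in the paper.
    """
    hyper = np.concatenate([state, belief])
    # Discretise the continuous hyper-state into a grid cell.
    key = tuple(np.floor(hyper * bins).astype(int))
    counts[key] = counts.get(key, 0) + 1
    # Classic 1/sqrt(n) count-based bonus: decays with repeated visits.
    return scale / np.sqrt(counts[key])

counts = {}
state = np.array([0.1, 0.2])   # environment state
belief = np.array([0.5])       # approximate belief over tasks
first = hyper_state_bonus(counts, state, belief)   # novel hyper-state
second = hyper_state_bonus(counts, state, belief)  # revisit, smaller bonus
```

Here `first` is 1.0 (a never-seen hyper-state) and `second` drops to 1/sqrt(2); the same environment state under a different belief falls in a different grid cell and is treated as novel again, which is the key property of exploring in hyper-state space rather than state space alone.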




