On the Practical Consistency of Meta-Reinforcement Learning Algorithms

12/01/2021
by   Zheng Xiong, et al.

Consistency is the theoretical property of a meta-learning algorithm that ensures that, under certain assumptions, it can adapt to any task at test time. An open question is whether and how theoretical consistency translates into practice, in comparison to inconsistent algorithms. In this paper, we empirically investigate this question on a set of representative meta-RL algorithms. We find that theoretically consistent algorithms can indeed usually adapt to out-of-distribution (OOD) tasks, while inconsistent ones cannot, although consistent algorithms can still fail in practice for reasons such as poor exploration. We further find that theoretically inconsistent algorithms can be made consistent by continuing to update all agent components on the OOD tasks, and that they then adapt as well as or better than originally consistent ones. We conclude that theoretical consistency is indeed a desirable property, and that inconsistent meta-RL algorithms can easily be made consistent to enjoy the same benefits.

research
07/06/2021

Meta-Reinforcement Learning for Heuristic Planning

In Meta-Reinforcement Learning (meta-RL) an agent is trained on a set of...
research
09/18/2021

Hindsight Foresight Relabeling for Meta-Reinforcement Learning

Meta-reinforcement learning (meta-RL) algorithms allow for agents to lea...
research
11/02/2020

Information-theoretic Task Selection for Meta-Reinforcement Learning

In Meta-Reinforcement Learning (meta-RL) an agent is trained on a set of...
research
07/27/2011

Time Consistent Discounting

A possibly immortal agent tries to maximise its summed discounted reward...
research
06/07/2022

On the Effectiveness of Fine-tuning Versus Meta-reinforcement Learning

Intelligent agents should have the ability to leverage knowledge from pr...
research
05/10/2021

Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization

(Partial) ranking loss is a commonly used evaluation measure for multi-l...
research
01/30/2017

Reinforcement Learning Algorithm Selection

This paper formalises the problem of online algorithm selection in the c...
