Universal Successor Features for Transfer Reinforcement Learning

01/05/2020
by Chen Ma, et al.

Transfer in Reinforcement Learning (RL) refers to the idea of applying knowledge gained from previous tasks to solving related tasks. Learning a universal value function (Schaul et al., 2015), which generalizes over goals and states, has previously been shown to be useful for transfer. However, successor features are believed to be more suitable than values for transfer (Dayan, 1993; Barreto et al., 2017), even though they cannot directly generalize to new goals. In this paper, we propose (1) Universal Successor Features (USFs) to capture the underlying dynamics of the environment while allowing generalization to unseen goals and (2) a flexible end-to-end model of USFs that can be trained by interacting with the environment. We show that learning USFs is compatible with any RL algorithm that learns state values using a temporal difference method. Our experiments in a simple gridworld and in two MuJoCo environments show that USFs can greatly accelerate training when learning multiple tasks and can effectively transfer knowledge to new tasks.
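
To make the temporal difference connection concrete, here is a minimal tabular sketch of learning successor features for one fixed policy. It is not the paper's end-to-end USF model: the 5-state chain, the one-hot features `phi`, the "always move right" policy, and the function name `sf_td_update` are all assumptions chosen for illustration. In a goal-conditioned (universal) variant, both the successor features and the reward weights `w` would additionally depend on the goal.

```python
import numpy as np

def sf_td_update(psi, phi, s, s_next, terminal, gamma=0.9, alpha=0.5):
    """One TD(0) step on tabular successor features psi[s] (hypothetical helper)."""
    # TD target: current features plus discounted successor features of the next state.
    bootstrap = np.zeros_like(psi[s]) if terminal else gamma * psi[s_next]
    target = phi[s] + bootstrap
    psi[s] += alpha * (target - psi[s])
    return psi

# Assumed toy setup: a 5-state chain, the policy always moves right,
# and the last state is terminal.
n_states = 5
gamma = 0.9
phi = np.eye(n_states)                 # one-hot state features (assumption)
psi = np.zeros((n_states, n_states))   # psi[s] estimates the discounted sum of future phi

for _ in range(200):                   # repeated sweeps over all transitions
    for s in range(n_states):
        terminal = (s == n_states - 1)
        s_next = s if terminal else s + 1
        sf_td_update(psi, phi, s, s_next, terminal, gamma=gamma)

# Reward weights for a task that pays 1 in the last state (assumption).
w = np.zeros(n_states)
w[-1] = 1.0

# State values follow from a dot product, with no further learning.
values = psi @ w
```

Because the reward enters only through `w`, swapping in a different `w` re-evaluates the same policy on a new reward without relearning `psi`. This separation of dynamics from reward is what the abstract builds on.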

