Understanding Self-Predictive Learning for Reinforcement Learning

12/06/2022
by Yunhao Tang, et al.

We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite their recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet converging to such solutions is clearly undesirable. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that self-predictive learning dynamics carry out spectral decomposition on the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments.
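
The two ingredients named in the abstract (a predictor optimized faster than the representation, and semi-gradient updates on the representation) can be illustrated with a small tabular sketch. This is not the authors' code. It assumes a linear representation Phi (one row per state), a known hypothetical 3-state transition matrix T, uniform state weighting, and a predictor P solved in closed form at every step as a stand-in for "faster-paced" optimization. All names are illustrative. Under these assumptions the update direction is orthogonal to Phi, so the Gram matrix Phi^T Phi stays nearly constant and the representation cannot shrink to zero or lose rank.

```python
# Hypothetical 3-state symmetric transition matrix (each row sums to 1).
T = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def gram(Phi):
    """Phi^T Phi, the 2x2 Gram matrix of the representation."""
    return matmul(transpose(Phi), Phi)

def optimal_predictor(Phi):
    """Least-squares predictor P = (Phi^T Phi)^{-1} Phi^T T Phi.

    Solving for P exactly at each step is an assumption that models a
    predictor trained much faster than the representation.
    """
    cross = matmul(transpose(Phi), matmul(T, Phi))
    return matmul(inverse_2x2(gram(Phi)), cross)

def semi_gradient_step(Phi, lr):
    """One update of Phi on ||Phi P - stop_grad(T Phi)||^2 with P held fixed."""
    P = optimal_predictor(Phi)
    target = matmul(T, Phi)        # treated as a constant (stop-gradient)
    pred = matmul(Phi, P)
    residual = [[p - t for p, t in zip(pr, tr)]
                for pr, tr in zip(pred, target)]
    # Gradient flows only through the prediction branch Phi P.
    grad = matmul(residual, transpose(P))
    return [[x - lr * g for x, g in zip(xr, gr)]
            for xr, gr in zip(Phi, grad)]

def train(Phi, steps=500, lr=0.01):
    for _ in range(steps):
        Phi = semi_gradient_step(Phi, lr)
    return Phi

# Arbitrary rank-2 initial representation: 3 states, 2 features.
Phi0 = [[1.0, 0.0], [0.0, 1.0], [0.5, -0.5]]
Phi = train(Phi0)
print("initial Gram:", gram(Phi0))
print("final Gram:  ", gram(Phi))
```

Comparing the two printed Gram matrices shows how little the representation's scale and rank drift over training. If the stop-gradient on the target is dropped, or P is left far from its least-squares solution, this orthogonality argument no longer applies, and that is the regime in which the abstract warns of collapse.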


research
04/30/2020

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning

Learning a good representation is an essential component for deep reinfo...
research
05/29/2023

Towards a Better Understanding of Representation Dynamics under TD-learning

TD-learning is a foundation reinforcement learning (RL) algorithm for va...
research
02/25/2021

On The Effect of Auxiliary Tasks on Representation Dynamics

While auxiliary tasks play a key role in shaping the representations lea...
research
08/19/2022

Spectral Decomposition Representation for Reinforcement Learning

Representation learning often plays a critical role in reinforcement lea...
research
02/25/2021

Visualizing MuZero Models

MuZero, a model-based reinforcement learning algorithm that uses a value...
research
05/01/2023

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition

Representation learning and exploration are among the key challenges for...
research
02/09/2023

The Edge of Orthogonality: A Simple View of What Makes BYOL Tick

Self-predictive unsupervised learning methods such as BYOL or SimSiam ha...
