Bootstrapped Representations in Reinforcement Learning

by Charline Le Lan et al.

In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces. While one of the promises of deep learning algorithms is to automatically construct features well-tuned for the task they try to solve, such a representation might not emerge from end-to-end training of deep RL agents. To mitigate this issue, auxiliary objectives are often incorporated into the learning process and help shape the learnt state representation. Bootstrapping methods are today's method of choice to make these additional predictions. Yet, it is unclear which features these algorithms capture and how they relate to those from other auxiliary-task-based approaches. In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988). Surprisingly, we find that this representation differs from the features learned by Monte Carlo and residual gradient algorithms for most transition structures of the environment in the policy evaluation setting. We describe the efficacy of these representations for policy evaluation, and use our theoretical analysis to design new auxiliary learning rules. We complement our theoretical results with an empirical comparison of these learning rules for different cumulant functions on classic domains such as the four-room domain (Sutton et al., 1999) and Mountain Car (Moore, 1990).
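To make the contrast between the bootstrapped and Monte Carlo learning rules discussed above concrete, here is a minimal tabular sketch of the two value updates on a hypothetical 3-state chain MDP (the chain, rewards, and step sizes are illustrative assumptions, not taken from the paper, and a tabular example cannot capture the paper's representation-level analysis):

```python
import numpy as np

# Hypothetical 3-state chain used only for illustration:
# s0 -> s1 -> terminal, reward 1.0 on the final transition, discount 0.9.
gamma, alpha = 0.9, 0.1
V_td = np.zeros(3)  # values learned by temporal-difference (bootstrapped) updates
V_mc = np.zeros(3)  # values learned by Monte Carlo (full-return) updates

for _ in range(2000):
    # One episode as (state, reward, next_state); next_state = -1 means terminal.
    episode = [(0, 0.0, 1), (1, 1.0, -1)]

    # TD(0): bootstrap the target from the current estimate of the next state.
    for s, r, s_next in episode:
        target = r + (gamma * V_td[s_next] if s_next >= 0 else 0.0)
        V_td[s] += alpha * (target - V_td[s])

    # Monte Carlo: regress each state toward its observed discounted return.
    G = 0.0
    for s, r, _ in reversed(episode):
        G = r + gamma * G
        V_mc[s] += alpha * (G - V_mc[s])

print(V_td)  # approaches [0.9, 1.0, 0.0]
print(V_mc)  # approaches [0.9, 1.0, 0.0]
```

In this deterministic tabular setting both rules converge to the same value function; the paper's point is that under function approximation the *features* induced by bootstrapped targets differ from those induced by Monte Carlo or residual-gradient targets for most transition structures.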

