The Information Geometry of Unsupervised Reinforcement Learning

10/06/2021
by   Benjamin Eysenbach, et al.

How can a reinforcement learning (RL) agent prepare to solve downstream tasks if those tasks are not known a priori? One approach is unsupervised skill discovery, a class of algorithms that learn a set of policies without access to a reward function. Such algorithms closely resemble representation learning algorithms (e.g., contrastive learning) in supervised learning, in that both are pretraining algorithms that maximize some approximation to a mutual information objective. While prior work has shown that the set of skills learned by such methods can accelerate downstream RL tasks, it offers little analysis of whether these skill learning algorithms are optimal, or even of what notion of optimality would be appropriate to apply to them. In this work, we show that unsupervised skill discovery algorithms based on mutual information maximization do not learn skills that are optimal for every possible reward function. However, we show that the distribution over skills provides an optimal initialization minimizing regret against adversarially chosen reward functions, assuming a certain type of adaptation procedure. Our analysis also provides a geometric perspective on these skill learning methods.
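To make the "mutual information objective" concrete, here is a minimal sketch under stated assumptions. The abstract does not say which variables the mutual information is taken between. The example assumes a common formulation: the mutual information I(S; Z) between a discrete skill Z and the state S that the skill visits. The function name `mutual_information`, the symbols `p_z` and `p_s_given_z`, and the toy distributions are all hypothetical and are not taken from the paper.

```python
import math


def mutual_information(p_z, p_s_given_z):
    """Return I(S; Z) in nats for discrete skills and states.

    p_z: list of prior probabilities over skills (assumed to sum to 1).
    p_s_given_z: one row per skill; each row is that skill's assumed
        distribution over states (each row sums to 1).
    """
    num_skills = len(p_z)
    num_states = len(p_s_given_z[0])

    # Marginal state distribution: p(s) = sum_z p(z) * p(s | z).
    p_s = [
        sum(p_z[z] * p_s_given_z[z][s] for z in range(num_skills))
        for s in range(num_states)
    ]

    # I(S; Z) = sum_{z, s} p(z) p(s | z) log( p(s | z) / p(s) ).
    # Terms with zero joint probability contribute nothing and are skipped.
    mi = 0.0
    for z in range(num_skills):
        for s in range(num_states):
            joint = p_z[z] * p_s_given_z[z][s]
            if joint > 0.0:
                mi += joint * math.log(p_s_given_z[z][s] / p_s[s])
    return mi


# Toy case 1: two skills that visit disjoint states are fully distinguishable.
distinct = mutual_information([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])

# Toy case 2: two skills with identical state distributions carry no information.
identical = mutual_information([0.5, 0.5], [[0.5, 0.5], [0.5, 0.5]])
```

In this toy setting, `distinct` equals log 2 (the entropy of the uniform skill prior) and `identical` equals 0, which illustrates why maximizing such an objective pushes skills toward visiting distinguishable states.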


research
02/01/2022

CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery

We introduce Contrastive Intrinsic Control (CIC), an algorithm for unsup...
research
10/14/2022

Skill-Based Reinforcement Learning with Intrinsic Reward Matching

While unsupervised skill discovery has shown promise in autonomously acq...
research
10/15/2021

Wasserstein Unsupervised Reinforcement Learning

Unsupervised reinforcement learning aims to train agents to learn a hand...
research
07/18/2021

Unsupervised Skill-Discovery and Skill-Learning in Minecraft

Pre-training Reinforcement Learning agents in a task-agnostic manner has...
research
10/06/2022

Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery

Deep Reinforcement Learning (RL) has emerged as a powerful paradigm for ...
research
12/08/2022

Learning Options via Compression

Identifying statistical regularities in solutions to some tasks in multi...
research
08/24/2023

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

We study diverse skill discovery in reward-free environments, aiming to ...
