A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning

02/25/2021
by Pascal Klink, et al.

Across machine learning, the use of curricula has shown strong empirical potential to improve learning from data by avoiding local optima of training objectives. For reinforcement learning (RL), curricula are especially interesting, as the underlying optimization has a strong tendency to get stuck in local optima due to the exploration-exploitation trade-off. Recently, a number of approaches for automatically generating curricula for RL have been shown to increase performance while requiring less expert knowledge than manually designed curricula. However, these approaches are seldom investigated from a theoretical perspective, which prevents a deeper understanding of their mechanics. In this paper, we present an approach for automated curriculum generation in RL with a clear theoretical underpinning. More precisely, we formalize the well-known self-paced learning paradigm as inducing a distribution over training tasks, which trades off between task complexity and the objective of matching a desired task distribution. Experiments show that training on this induced distribution helps to avoid poor local optima across RL algorithms in different tasks with uninformative rewards and challenging exploration requirements.
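
To make the trade-off described above concrete, here is a minimal sketch for a finite set of tasks. It is an illustration under stated assumptions, not the paper's algorithm: it assumes that the agent's current expected return on a task serves as a proxy for how easy the task is, and that closeness to the desired task distribution is measured with a KL divergence weighted by a hypothetical parameter `alpha`. Under those assumptions, the distribution maximizing expected return minus `alpha` times the KL divergence has the closed form p(c) ∝ μ(c) exp(J(c) / α). The function and variable names are hypothetical.

```python
import math


def self_paced_task_distribution(expected_returns, target_probs, alpha):
    """Return p(c) proportional to mu(c) * exp(J(c) / alpha) on a finite task set.

    expected_returns: current expected return J(c) per task (assumed proxy
        for task ease; not taken from the paper).
    target_probs: desired task distribution mu(c); must sum to 1.
    alpha: positive weight on the KL term (hypothetical name).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    # Work in log space; tasks with zero target probability keep zero mass.
    log_weights = [
        (math.log(mu) if mu > 0 else -math.inf) + j / alpha
        for j, mu in zip(expected_returns, target_probs)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    max_log = max(log_weights)
    weights = [math.exp(lw - max_log) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


# Three tasks: easy, medium, hard. The target distribution favors the hard one.
returns = [1.0, 0.2, 0.0]
target = [0.1, 0.2, 0.7]

p_small_alpha = self_paced_task_distribution(returns, target, alpha=0.05)
p_large_alpha = self_paced_task_distribution(returns, target, alpha=100.0)
```

In this sketch, a small `alpha` concentrates the training distribution on tasks the agent already performs well on, while a large `alpha` pulls it toward the desired task distribution.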


