Few-shot Quality-Diversity Optimisation

09/14/2021
by Achkan Salehi, et al.

In the past few years, a considerable amount of research has been dedicated to the exploitation of previous learning experiences and the design of Few-shot and Meta Learning approaches, in problem domains ranging from Computer Vision to Reinforcement Learning-based control. A notable exception, where to the best of our knowledge little to no effort has been made in this direction, is Quality-Diversity (QD) optimisation. QD methods have been shown to be effective tools for dealing with deceptive minima and sparse rewards in Reinforcement Learning. However, they remain costly due to their reliance on inherently sample-inefficient evolutionary processes. We show that, given examples from a task distribution, information about the paths taken by optimisation in parameter space can be leveraged to build a prior population which, when used to initialise QD methods in unseen environments, allows for few-shot adaptation. Our proposed method does not require backpropagation. It is simple to implement and scale, and it is agnostic to the underlying models that are being trained. Experiments carried out in both sparse and dense reward settings, using robotic manipulation and navigation benchmarks, show that it considerably reduces the number of generations required for QD optimisation in these environments.
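
To make the idea of initialising a QD method from a prior population concrete, here is a minimal sketch. It is not the authors' method: it assumes a toy MAP-Elites-style grid archive, a made-up two-dimensional task, and a hand-built prior, whereas the abstract describes a prior built from optimisation paths over a task distribution. All names (`evaluate`, `cell_of`, `insert`, `map_elites`) are hypothetical and introduced only for illustration.

```python
import random

def evaluate(x):
    """Toy task (an assumption): fitness peaks at (0.5, 0.5) and the
    behaviour descriptor is simply the parameter vector itself."""
    fitness = -((x[0] - 0.5) ** 2 + (x[1] - 0.5) ** 2)
    return fitness, (x[0], x[1])

def cell_of(descriptor, bins):
    """Map a descriptor in [0, 1]^2 to a cell of a bins x bins grid."""
    return tuple(min(int(d * bins), bins - 1) for d in descriptor)

def insert(archive, x, bins):
    """Keep x if its cell is empty or x beats the current elite there."""
    fitness, descriptor = evaluate(x)
    key = cell_of(descriptor, bins)
    if key not in archive or fitness > archive[key][0]:
        archive[key] = (fitness, list(x))

def map_elites(initial_population, generations, bins=5, batch=20,
               sigma=0.05, seed=0):
    """Gradient-free QD loop seeded with a caller-supplied population."""
    if not initial_population:
        raise ValueError("initial_population must not be empty")
    rng = random.Random(seed)
    archive = {}
    # Seeding step: this is where a prior population would be plugged in.
    for x in initial_population:
        insert(archive, x, bins)
    for _ in range(generations):
        elites = list(archive.values())
        for _ in range(batch):
            parent = rng.choice(elites)[1]
            # Gaussian mutation, clipped to the unit square.
            child = [min(1.0, max(0.0, v + rng.gauss(0.0, sigma)))
                     for v in parent]
            insert(archive, child, bins)
    return archive

# Hypothetical prior: one individual at the centre of each grid cell.
prior = [[(i + 0.5) / 5, (j + 0.5) / 5] for i in range(5) for j in range(5)]
seeded = map_elites(prior, generations=3)
unseeded = map_elites([[0.1, 0.1]], generations=3)
print(len(seeded), len(unseeded))  # number of filled cells in each archive
```

In this toy setting the seeded run starts with every cell already occupied, while the run started from a single individual has to discover cells through mutation. This only illustrates the initialisation mechanism; it says nothing about how the paper constructs its prior or about the size of the reported gains.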
