Optimistic Active Exploration of Dynamical Systems

06/21/2023
by Bhavya Sukhija, et al.

Reinforcement learning algorithms commonly seek to optimize policies for solving one particular task. How should we explore an unknown dynamical system so that the estimated model allows us to solve multiple downstream tasks in a zero-shot manner? In this paper, we address this challenge by developing an algorithm, OPAX, for active exploration. OPAX uses well-calibrated probabilistic models to quantify the epistemic uncertainty about the unknown dynamics. Acting optimistically with respect to plausible dynamics, it maximizes the information gain between the unknown dynamics and the state observations. We show how the resulting optimization problem can be reduced to an optimal control problem that can be solved at each episode using standard approaches. We analyze our algorithm for general models and, in the case of Gaussian process dynamics, give a sample complexity bound and show that the epistemic uncertainty converges to zero. In our experiments, we compare OPAX with other heuristic active exploration approaches on several environments. Our experiments show that OPAX is not only theoretically sound but also performs well for zero-shot planning on novel downstream tasks.
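To illustrate the kind of objective the abstract describes, the sketch below uses an ensemble of hypothetical linear dynamics models as a stand-in for a calibrated probabilistic model, scores state-action pairs with a log-variance surrogate of the information gain, and picks an exploratory action sequence by random shooting over a mean-model rollout. All names (EnsembleDynamics, information_gain_reward, plan_random_shooting), the noise level, and the mean-model rollout are illustrative simplifications, not the paper's actual optimistic formulation or code.

```python
# Minimal sketch of an information-gain exploration objective (not the paper's code).
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, HORIZON = 3, 2, 10
NOISE_STD = 0.1  # assumed aleatoric noise scale


class EnsembleDynamics:
    """Ensemble of random linear models f_i(s, a) = A_i s + B_i a (toy stand-in)."""

    def __init__(self, n_members=5):
        self.A = rng.normal(scale=0.3, size=(n_members, STATE_DIM, STATE_DIM))
        self.B = rng.normal(scale=0.3, size=(n_members, STATE_DIM, ACTION_DIM))

    def predict(self, s, a):
        """Per-member next-state predictions, shape (n_members, state_dim)."""
        return np.einsum("nij,j->ni", self.A, s) + np.einsum("nij,j->ni", self.B, a)

    def epistemic_std(self, s, a):
        """Disagreement across members as a proxy for epistemic uncertainty."""
        return self.predict(s, a).std(axis=0)


def information_gain_reward(sigma):
    """Per-step surrogate for the information gain: sum_j log(1 + sigma_j^2 / rho^2)."""
    return np.sum(np.log1p(sigma**2 / NOISE_STD**2))


def plan_random_shooting(model, s0, n_candidates=256):
    """Pick the action sequence whose rollout accumulates the most exploration reward.
    This uses the ensemble mean for the rollout; OPAX instead plans optimistically
    over all plausible dynamics, which this sketch does not implement."""
    best_value, best_actions = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(HORIZON, ACTION_DIM))
        s, value = s0, 0.0
        for a in actions:
            value += information_gain_reward(model.epistemic_std(s, a))
            s = model.predict(s, a).mean(axis=0)
        if value > best_value:
            best_value, best_actions = value, actions
    return best_actions, best_value


model = EnsembleDynamics()
actions, value = plan_random_shooting(model, s0=np.zeros(STATE_DIM))
print(f"best exploration value over horizon: {value:.3f}")
```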


