DropoutDAgger: A Bayesian Approach to Safe Imitation Learning

09/18/2017
by Kunal Menda, et al.

While imitation learning is becoming common practice in robotics, it often suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by continually aggregating training data from both the expert and novice policies, but it does not account for safety. We present a probabilistic extension to DAgger that uses the distribution over actions the novice policy provides for a given observation. Our method, which we call DropoutDAgger, uses dropout to train the novice as a Bayesian neural network that provides insight into its confidence. Using the distribution over the novice's actions, we estimate a probabilistic measure of safety with respect to the expert action, tuned to balance exploration and exploitation. We evaluate the approach on MuJoCo HalfCheetah and in a simple driving experiment, demonstrating improved performance and safety compared to other DAgger variants and classic imitation learning.
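The abstract does not spell out the decision rule, so the following is only a minimal sketch of one plausible reading: sample several novice actions with dropout left active, measure what fraction fall close to the expert action, and let the novice act only when that fraction is high enough. The two-layer network, the function names `mc_dropout_actions` and `choose_action`, and the parameters `tau`, `p`, `keep_prob`, and `n_samples` are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def mc_dropout_actions(obs, weights, rng, n_samples=50, keep_prob=0.9):
    """Sample novice actions by keeping dropout active at inference time.

    Hypothetical two-layer network: weights = (w1, w2).
    """
    w1, w2 = weights
    hidden = np.tanh(obs @ w1)                       # shape: (hidden_dim,)
    # One independent Bernoulli mask per sample, rescaled (inverted dropout).
    masks = rng.random((n_samples, hidden.shape[0])) < keep_prob
    dropped = masks * hidden / keep_prob             # (n_samples, hidden_dim)
    return dropped @ w2                              # (n_samples, action_dim)

def choose_action(novice_samples, expert_action, tau=0.5, p=0.8):
    """Pick who acts, using an assumed safety test.

    tau: assumed distance threshold between novice and expert actions.
    p:   assumed minimum fraction of samples that must fall within tau.
    """
    dists = np.linalg.norm(novice_samples - expert_action, axis=1)
    safe_fraction = float(np.mean(dists <= tau))
    if safe_fraction >= p:
        # Novice is confidently close to the expert: act with its mean action.
        return novice_samples.mean(axis=0), "novice", safe_fraction
    # Otherwise fall back to the expert action.
    return expert_action, "expert", safe_fraction

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = (rng.normal(size=(3, 8)), rng.normal(size=(8, 2)))
    samples = mc_dropout_actions(np.ones(3), weights, rng)
    action, actor, frac = choose_action(samples, expert_action=samples.mean(axis=0))
    print(actor, frac, action)
```

In this sketch, a smaller `tau` or a larger `p` hands control to the expert more often, which is one way the trade-off between exploration and exploitation mentioned above could be tuned.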


research
07/22/2018

EnsembleDAgger: A Bayesian Approach to Safe Imitation Learning

While imitation learning is often used in robotics, this approach often ...
research
07/09/2021

Adversarial Mixture Density Networks: Learning to Drive Safely from Collision Data

Imitation learning has been widely used to learn control policies for au...
research
10/05/2018

HG-DAgger: Interactive Imitation Learning with Human Experts

Imitation learning has proven to be useful for many real-world problems,...
research
03/03/2022

Fail-Safe Generative Adversarial Imitation Learning

For flexible yet safe imitation learning (IL), we propose a modular appr...
research
02/18/2021

Closing the Closed-Loop Distribution Shift in Safe Imitation Learning

Commonly used optimization-based control strategies such as model-predic...
research
08/31/2018

Imitation Learning for Neural Morphological String Transduction

We employ imitation learning to train a neural transition-based string t...
research
01/22/2020

Safety Considerations in Deep Control Policies with Probabilistic Safety Barrier Certificates

Recent advances in Deep Machine Learning have shown promise in solving c...
