Imitation Learning as f-Divergence Minimization

05/30/2019
by Liyiming Ke, et al.

We address the problem of imitation learning with multi-modal demonstrations. Instead of attempting to learn all modes, we argue that in many tasks it is sufficient to imitate any one of them. We show that state-of-the-art methods such as GAIL and behavior cloning, due to their choice of loss function, often incorrectly interpolate between such modes. Our key insight is to minimize the right divergence between the learner and the expert state-action distributions, namely the reverse KL divergence, or I-projection. We propose a general imitation learning framework for estimating and minimizing any f-divergence. By plugging in different divergences, we recover existing algorithms such as behavior cloning (Kullback-Leibler), GAIL (Jensen-Shannon), and DAgger (total variation). Empirical results show that our approximate I-projection technique imitates multi-modal behaviors more reliably than GAIL and behavior cloning.
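The difference between the two KL directions can be seen in a toy setting. The sketch below is only an illustration, not the paper's algorithm: it assumes a one-dimensional "expert" distribution with two modes at -2 and +2, a single-mode Gaussian "learner" with a fixed width, and a brute-force search over candidate means. All names (`f_divergence`, `EXPERT`, `learner`, `CANDIDATE_MEANS`) are hypothetical. It uses the standard discrete f-divergence D_f(p, q) = sum_x q(x) f(p(x) / q(x)) with p the expert and q the learner.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normalize(weights):
    """Rescale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def f_divergence(p, q, f):
    """Discrete f-divergence: sum over x of q(x) * f(p(x) / q(x))."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


# Generator functions f(u) for two members of the family (p = expert, q = learner).
GENERATORS = {
    "forward_kl": lambda u: u * math.log(u),  # KL(expert || learner), M-projection
    "reverse_kl": lambda u: -math.log(u),     # KL(learner || expert), I-projection
}

# Assumed toy setup: a grid on [-6, 6] and a two-mode expert.
XS = [-6.0 + 0.1 * i for i in range(121)]
EXPERT = normalize(
    [0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 2.0, 0.5) for x in XS]
)


def learner(mean):
    """Single-mode learner distribution on the grid, fixed std of 0.5."""
    return normalize([normal_pdf(x, mean, 0.5) for x in XS])


# Brute-force search over candidate learner means from -3.0 to 3.0.
CANDIDATE_MEANS = [0.5 * i for i in range(-6, 7)]


def best_mean(name):
    """Candidate mean that minimizes the named divergence to the expert."""
    f = GENERATORS[name]
    return min(CANDIDATE_MEANS, key=lambda m: f_divergence(EXPERT, learner(m), f))


if __name__ == "__main__":
    for name in GENERATORS:
        print(name, "-> best learner mean:", best_mean(name))
```

In this setup the forward KL is minimized by a learner centered at 0, between the two expert modes, while the reverse KL is minimized by a learner centered on one of the modes at -2 or +2. This mirrors the interpolation behavior the abstract attributes to the choice of loss, under the simplifying assumptions stated above.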

research
10/02/2020

f-GAIL: Learning f-Divergence for Generative Adversarial Imitation Learning

Imitation learning (IL) aims to learn a policy from expert demonstration...
research
10/13/2017

Burn-In Demonstrations for Multi-Modal Imitation Learning

Recent work on imitation learning has generated policies that reproduce ...
research
05/19/2020

Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets

Generative adversarial imitation learning (GAIL) has shown promising res...
research
06/18/2020

Reparameterized Variational Divergence Minimization for Stable Imitation

While recent state-of-the-art results for adversarial imitation-learning...
research
05/06/2022

Diverse Imitation Learning via Self-Organizing Generative Models

Imitation learning is the task of replicating expert policy from demonst...
research
11/06/2019

A Divergence Minimization Perspective on Imitation Learning Methods

In many settings, it is desirable to learn decision-making and control p...
research
05/22/2018

Maximum Causal Tsallis Entropy Imitation Learning

In this paper, we propose a novel maximum causal Tsallis entropy (MCTE) ...
