Optimism is All You Need: Model-Based Imitation Learning From Observation Alone

02/22/2021
by Rahul Kidambi, et al.

This paper studies Imitation Learning from Observations alone (ILFO), where the learner is presented with expert demonstrations that consist only of the states encountered by an expert, without access to the actions the expert took. We present MobILE, a provably efficient model-based framework for solving the ILFO problem. MobILE involves carefully trading off exploration against imitation; this is achieved by integrating the idea of optimism in the face of uncertainty into the distribution matching imitation learning (IL) framework. We provide a unified analysis for MobILE and demonstrate that it enjoys strong performance guarantees for classes of MDP dynamics that satisfy certain well-studied notions of complexity. We also show that the ILFO problem is strictly harder than the standard IL problem by reducing ILFO to a multi-armed bandit problem, indicating that exploration is necessary for ILFO. We complement these theoretical results with experimental simulations on benchmark OpenAI Gym tasks that indicate the efficacy of MobILE.
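
To make the exploration-versus-imitation trade-off concrete, here is a minimal sketch of one common way to encode optimism in the face of uncertainty: subtract an exploration bonus from a state-only imitation cost, so that rarely tried state-action pairs look cheaper to a planner. This is an illustration only, not the implementation from the paper. The count-based bonus and the names `count_bonus`, `optimistic_cost`, `imitation_cost`, and `scale` are all assumptions introduced here.

```python
import math
from collections import defaultdict


def count_bonus(counts, state, action, scale=1.0):
    """Hypothetical bonus: shrinks as (state, action) is visited more often."""
    n = counts[(state, action)]
    return scale / math.sqrt(n + 1)


def optimistic_cost(imitation_cost, counts, state, action, scale=1.0):
    """State-only imitation cost minus the bonus (assumed form, for illustration)."""
    return imitation_cost(state) - count_bonus(counts, state, action, scale)


def imitation_cost(state):
    # Placeholder for a learned distribution-matching cost; it depends on the
    # state only, because ILFO demonstrations contain no expert actions.
    return 0.5


counts = defaultdict(int)
counts[("s0", "left")] = 99  # "left" has been tried many times in s0
# ("s0", "right") has never been tried, so its count defaults to 0.

costs = {a: optimistic_cost(imitation_cost, counts, "s0", a) for a in ("left", "right")}
best_action = min(costs, key=costs.get)
```

In this toy setting both actions have the same imitation cost, so the bonus alone steers the choice toward the untried action. As data accumulates the bonus decays and the imitation cost dominates.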

research
05/27/2019

Provably Efficient Imitation Learning from Observation Alone

We study Imitation Learning (IL) from Observations alone (ILFO) in large...
research
06/17/2021

Seeing Differently, Acting Similarly: Imitation Learning with Heterogeneous Observations

In many real-world imitation learning tasks, the demonstrator and the le...
research
12/30/2022

Learning from Guided Play: Improving Exploration for Adversarial Imitation Learning with Simple Auxiliary Tasks

Adversarial imitation learning (AIL) has become a popular alternative to...
research
04/03/2021

No Need for Interactions: Robust Model-Based Imitation Learning using Neural ODE

Interactions with either environments or expert policies during training...
research
08/03/2022

Sequence Model Imitation Learning with Unobserved Contexts

We consider imitation learning problems where the expert has access to a...
research
10/10/2019

Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

This paper studies Learning from Observations (LfO) for imitation learni...
research
03/04/2021

Of Moments and Matching: Trade-offs and Treatments in Imitation Learning

We provide a unifying view of a large family of previous imitation learn...
