Provable Hierarchical Imitation Learning via EM

10/07/2020
by   Zhiyu Zhang, et al.

Due to recent empirical successes, the options framework for hierarchical reinforcement learning is gaining increasing popularity. Rather than learning from rewards, which suffers from the curse of dimensionality, we consider learning an options-type hierarchical policy from expert demonstrations. Such a problem is referred to as hierarchical imitation learning. Converting this problem to parameter inference in a latent variable model, we theoretically characterize the EM approach proposed by Daniel et al. (2016). The population-level algorithm is analyzed as an intermediate step, which is nontrivial because the samples are correlated. If the expert policy can be parameterized by a variant of the options framework, then under regularity conditions, we prove that the proposed algorithm converges with high probability to a norm ball around the true parameter. To our knowledge, this is the first performance guarantee for a hierarchical imitation learning algorithm that only observes primitive state-action pairs.
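
To make the latent-variable view concrete, here is a minimal sketch of the E-step only: a forward-backward pass that infers which option was active at each step from state-action pairs alone. It assumes a simplified tabular options model that is not necessarily the variant analyzed in the paper. At each step the previous option terminates with probability `b[o, s]`, a new option is then drawn from `pi_hi[s]`, and the action is drawn from `pi_lo[o, s]`. All names (`option_posteriors`, `pi_hi`, `pi_lo`, `b`) are hypothetical and chosen for illustration.

```python
import numpy as np

def option_posteriors(states, actions, pi_hi, pi_lo, b):
    """Posterior P(o_t | all states, all actions) via forward-backward.

    Assumed (hypothetical) shapes for a tabular model:
      pi_hi: (S, K)    option-selection probabilities per state
      pi_lo: (K, S, A) action probabilities per option and state
      b:     (K, S)    termination probability of each option per state
    Environment transitions do not depend on the option, so they cancel
    out of the posterior and are not needed here.
    """
    T, K = len(states), pi_hi.shape[1]
    # Likelihood of each observed action under every option, shape (T, K).
    lik = np.stack([pi_lo[:, s, a] for s, a in zip(states, actions)])

    def kernel(s):
        # P(o_t = j | o_{t-1} = i, s_t = s):
        # keep option i, or terminate it and re-select from pi_hi[s].
        stay = (1.0 - b[:, s])[:, None] * np.eye(K)
        switch = b[:, s][:, None] * pi_hi[s][None, :]
        return stay + switch

    # Forward pass, normalized at each step for numerical stability.
    alpha = np.zeros((T, K))
    alpha[0] = pi_hi[states[0]] * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ kernel(states[t])) * lik[t]
        alpha[t] /= alpha[t].sum()

    # Backward pass, also normalized at each step.
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = kernel(states[t + 1]) @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()

    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)
```

In a full EM loop, these posteriors would weight the log-likelihood terms maximized in the M-step. That step depends on how the policy is parameterized and is omitted here.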


