Latent Policies for Adversarial Imitation Learning

by Tianyu Wang, et al.

This paper considers learning robot locomotion and manipulation tasks from expert demonstrations. Generative adversarial imitation learning (GAIL) trains a discriminator that distinguishes expert from agent transitions and, in turn, uses a reward defined by the discriminator output to optimize a policy generator for the agent. This generative adversarial training approach is powerful but depends on a delicate balance between discriminator and generator training. In high-dimensional problems, the discriminator may easily overfit or exploit associations with task-irrelevant features when classifying transitions. A key insight of this work is that performing imitation learning in a suitable latent task space makes the training process stable, even in challenging high-dimensional problems. We use an action encoder-decoder model to obtain a low-dimensional latent action space and train a LAtent Policy using Adversarial imitation Learning (LAPAL). The encoder-decoder model can be trained offline from state-action pairs to obtain a task-agnostic latent action representation, or online, simultaneously with the discriminator and generator training, to obtain a task-aware latent action representation. We demonstrate that LAPAL training is stable, with near-monotonic performance improvement, and achieves expert performance in most locomotion and manipulation tasks, while a GAIL baseline converges more slowly and does not achieve expert performance in high-dimensional environments.
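To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes one common GAIL reward convention, r = -log(1 - D), where D is the discriminator's probability that a transition came from the expert, and it replaces the learned action decoder with a hypothetical fixed linear map that ignores the state. The names `gail_reward`, `LinearActionDecoder`, and `latent_policy_step` are illustrative only.

```python
import numpy as np


def gail_reward(logits):
    """Reward -log(1 - D), with D = sigmoid(logit) the expert probability.

    Assumed convention; -log(1 - sigmoid(x)) equals softplus(x), which
    np.logaddexp(0, x) evaluates in a numerically stable way.
    """
    return np.logaddexp(0.0, logits)


class LinearActionDecoder:
    """Hypothetical stand-in for a learned decoder (latent -> full action)."""

    def __init__(self, latent_dim, action_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Random weights here; a real decoder would be trained from
        # state-action pairs, offline or jointly with the policy.
        self.W = rng.standard_normal((action_dim, latent_dim)) / np.sqrt(latent_dim)

    def decode(self, z):
        # Squash to [-1, 1], a typical normalized action range (assumption).
        return np.tanh(self.W @ z)


def latent_policy_step(policy_weights, state, decoder):
    """Pick a low-dimensional latent action, then decode it for the robot."""
    z = np.tanh(policy_weights @ state)  # deterministic latent policy (toy)
    return z, decoder.decode(z)


if __name__ == "__main__":
    decoder = LinearActionDecoder(latent_dim=2, action_dim=6)
    weights = np.full((2, 8), 0.1)  # toy policy parameters
    z, action = latent_policy_step(weights, np.ones(8), decoder)
    print("latent action:", z.shape, "decoded action:", action.shape)
    print("rewards:", gail_reward(np.array([-2.0, 0.0, 2.0])))
```

In this sketch the policy only ever outputs the 2-dimensional latent action, and the decoder maps it to the 6-dimensional action that is executed. The abstract does not specify the space in which the discriminator operates, so the sketch leaves that choice open.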




Task-Relevant Adversarial Imitation Learning

We show that a critical problem in adversarial imitation from high-dimen...

Goal-Aware Generative Adversarial Imitation Learning from Imperfect Demonstration for Robotic Cloth Manipulation

Generative Adversarial Imitation Learning (GAIL) can learn policies with...

Expert-Level Atari Imitation Learning from Demonstrations Only

One of the key issues for imitation learning lies in making policy learn...

GAIL-PT: A Generic Intelligent Penetration Testing Framework with Generative Adversarial Imitation Learning

Penetration testing (PT) is an efficient network testing and vulnerabili...

Mature GAIL: Imitation Learning for Low-level and High-dimensional Input using Global Encoder and Cost Transformation

Recently, GAIL framework and various variants have shown remarkable poss...

Domain-Robust Visual Imitation Learning with Mutual Information Constraints

Human beings are able to understand objectives and learn by simply obser...

Robust Generative Adversarial Imitation Learning via Local Lipschitzness

We explore methodologies to improve the robustness of generative adversa...
