Out-of-Dynamics Imitation Learning from Multimodal Demonstrations

by Yiwen Qiu et al.

Existing imitation learning works mainly assume that the demonstrator who collects demonstrations shares the same dynamics as the imitator. However, this assumption limits the usage of imitation learning, especially when collecting demonstrations for the imitator is difficult. In this paper, we study out-of-dynamics imitation learning (OOD-IL), which relaxes this assumption: the demonstrator and the imitator share the same state space but may have different action spaces and dynamics. OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge: some demonstrations cannot be achieved by the imitator due to the different dynamics. Prior works filter out such demonstrations with feasibility measurements but ignore that the demonstrations exhibit a multimodal distribution, since different demonstrators may adopt different policies under different dynamics. We develop a better transferability measurement to tackle this newly-emerged challenge. We first design a novel sequence-based contrastive clustering algorithm to group demonstrations from the same mode, avoiding mutual interference between demonstrations from different modes, and then learn the transferability of each demonstration within each cluster using an adversarial-learning-based algorithm. Experimental results on several MuJoCo environments, a driving environment, and a simulated robot environment show that the proposed transferability measurement more accurately finds and down-weights non-transferable demonstrations and outperforms prior works on the final imitation learning performance. Videos of our experimental results are available on our website.
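The two-stage pipeline described above (cluster demonstrations by mode, then score transferability within each cluster) can be sketched in a minimal, hypothetical form. The sketch below is not the paper's implementation: the mean-state embedding stands in for the learned contrastive sequence encoder, plain k-means stands in for the contrastive clustering objective, and the discriminator scores are dummy placeholders for the per-cluster adversarial discriminator.

```python
import numpy as np

def embed(traj):
    """Embed a state sequence as its mean state. A stand-in for a learned
    sequence encoder trained with a contrastive objective (assumption)."""
    return traj.mean(axis=0)

def kmeans(X, k, iters=50):
    """Plain k-means with farthest-point initialization, standing in for
    the paper's sequence-based contrastive clustering (assumption)."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[int(d.argmax())])
    centers = np.stack(centers)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def transferability_weights(scores):
    """Map raw per-demonstration scores (higher = more achievable by the
    imitator) to normalized weights, down-weighting non-transferable demos."""
    w = 1.0 / (1.0 + np.exp(-scores))  # sigmoid squashing
    return w / w.sum()

# Toy data: six demonstrations from two modes with different mean states.
rng = np.random.default_rng(1)
demos = [rng.normal(m, 0.1, size=(20, 3)) for m in [0., 0., 0., 5., 5., 5.]]
X = np.stack([embed(t) for t in demos])
labels = kmeans(X, k=2)
# In the actual method, an adversarial discriminator trained per cluster
# would produce these scores; here they are hard-coded for illustration.
weights = transferability_weights(np.array([2., 1., -3., 2., 1., -3.]))
```

Clustering first matters because a discriminator trained across all modes at once would conflate "different mode" with "non-transferable"; scoring within a cluster compares each demonstration only against others of the same behavior mode.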




Related papers:
- Learning from Imperfect Demonstrations from Agents with Varying Dynamics
- Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations
- Let Me Check the Examples: Enhancing Demonstration Learning via Explicit Imitation
- Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality
- ADAIL: Adaptive Adversarial Imitation Learning
- Learning Feasibility to Imitate Demonstrators with Different Dynamics
- Explaining Imitation Learning through Frames
