Eliciting Compatible Demonstrations for Multi-Human Imitation Learning

by Kanishk Gandhi, et al.

Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation. The ideal dataset for imitation learning is homogeneous and low-variance, reflecting a single, optimal method for performing a task. Natural human behavior, however, is highly heterogeneous, with several optimal ways to demonstrate a task. This multimodality is inconsequential to human users, with task variations manifesting as subconscious choices; for example, reaching down, then across to grasp an object, versus reaching across, then down. Yet this mismatch presents a problem for interactive imitation learning, where sequences of users improve on a policy by iteratively collecting new, possibly conflicting demonstrations. To combat this problem of demonstrator incompatibility, this work designs an approach for 1) measuring the compatibility of a new demonstration given a base policy, and 2) actively eliciting more compatible demonstrations from new users. Across two simulation tasks requiring long-horizon, dexterous manipulation and a real-world "food plating" task with a Franka Emika Panda arm, we show that we can both identify incompatible demonstrations via post-hoc filtering and apply our compatibility measure to actively elicit compatible demonstrations from new users, leading to improved task success rates across simulated and real environments.
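
The abstract does not define the compatibility measure, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a deterministic base policy that maps a state to an action, and it scores a demonstration by how closely the policy's predicted actions match the demonstrated ones. The names `compatibility`, `filter_demonstrations`, `scale`, and `threshold` are hypothetical and chosen for this example.

```python
import numpy as np

def compatibility(policy, states, actions, scale=1.0):
    """Hypothetical score in (0, 1].

    A score of 1 means the base policy reproduces the demonstrated
    actions exactly; the score decays as the mean action error grows.
    This stand-in measure is an assumption, not the paper's definition.
    """
    predicted = np.array([policy(s) for s in states])
    errors = np.linalg.norm(predicted - np.asarray(actions), axis=1)
    return float(np.exp(-errors.mean() / scale))

def filter_demonstrations(policy, demos, threshold=0.5):
    """Post-hoc filtering: keep demos whose score meets the threshold.

    Each demo is a (states, actions) pair of equal-length arrays.
    """
    return [d for d in demos if compatibility(policy, d[0], d[1]) >= threshold]

# Toy usage: the base policy always moves toward the origin.
base_policy = lambda s: -np.asarray(s)
states = np.array([[1.0, 0.0], [0.0, 1.0]])
compatible_demo = (states, -states)    # agrees with the base policy
incompatible_demo = (states, states)   # moves in the opposite direction
kept = filter_demonstrations(base_policy, [compatible_demo, incompatible_demo])
```

In an active-elicitation setting, the same score could be shown to a new user after each demonstration as feedback, though how the paper presents that feedback is not stated in the abstract.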




Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation

We tackle real-world long-horizon robot manipulation tasks through skill...

Ergodic imitation: Learning from what to do and what not to do

With growing access to versatile robotics, it is beneficial for end user...

Bridging Action Space Mismatch in Learning from Demonstrations

Learning from demonstrations (LfD) methods guide learning agents to a de...

State Representation Learning from Demonstration

In a context where several policies can be observed as black boxes on di...

Interactive Imitation Learning of Bimanual Movement Primitives

Performing bimanual tasks with dual robotic setups can drastically incre...

Bayesian Disturbance Injection: Robust Imitation Learning of Flexible Policies for Robot Manipulation

Humans demonstrate a variety of interesting behavioral characteristics w...

Learning a Behavioral Repertoire from Demonstrations

Imitation Learning (IL) is a machine learning approach to learn a policy...
