Learning Options from Demonstration using Skill Segmentation

01/19/2020
by Matthew Cockcroft et al.

We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering, and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstrations is generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine algorithm. We demonstrate our method in the four rooms domain, where an agent is able to autonomously discover usable options from human demonstration. Our results show that these inferred options can then be used to improve learning and planning.
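The last step of the pipeline can be illustrated with a short sketch. This is not the authors' code: it assumes scikit-learn's OneClassSVM, and `segment_states` is hypothetical stand-in data for the states observed in the inferred trajectories of a single segmented skill.

```python
# Minimal sketch (assumptions noted above): estimating an option's
# initiation set from the states of one segmented skill using a
# one-class SVM.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical stand-in for (x, y) positions visited while executing
# one skill, e.g. states inside a single room of the four rooms domain.
segment_states = rng.uniform(0.0, 5.0, size=(200, 2))

# Fit the one-class SVM; states inside the learned support score +1.
init_clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1)
init_clf.fit(segment_states)

def in_initiation_set(state):
    """Return True if the option may be initiated from `state`."""
    x = np.asarray(state, dtype=float).reshape(1, -1)
    return init_clf.predict(x)[0] == 1

print(in_initiation_set([2.5, 2.5]))  # inside the sampled region: expect True
print(in_initiation_set([9.0, 9.0]))  # far outside the data: expect False
```

A termination condition could be modeled the same way, e.g. by fitting a second one-class SVM to the end states of each segment and terminating the option when the agent enters its support.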

Related research

Option Compatible Reward Inverse Reinforcement Learning (11/07/2019)
Reinforcement learning with complex tasks is a challenging problem. Ofte...

PODNet: A Neural Network for Discovery of Plannable Options (11/01/2019)
Learning from demonstration has been widely studied in machine learning ...

Discovering hierarchies using Imitation Learning from hierarchy aware policies (12/01/2018)
Learning options that allow agents to exhibit temporally higher order be...

Bayesian Nonparametrics for Offline Skill Discovery (02/09/2022)
Skills or low-level policies in reinforcement learning are temporally ex...

Object Manipulation Learning by Imitation (03/03/2016)
We aim to enable a robot to learn object manipulation by imitation. Given ...

Variational Intrinsic Control (11/22/2016)
In this paper we introduce a new unsupervised reinforcement learning met...

Towards Sample-efficient Apprenticeship Learning from Suboptimal Demonstration (10/08/2021)
Learning from Demonstration (LfD) seeks to democratize robotics by enabl...
