Interactively Teaching an Inverse Reinforcement Learner with Limited Feedback

09/16/2023
by Rustam Zayanov, et al.

We study the problem of teaching via demonstrations in sequential decision-making tasks. In particular, we focus on the setting in which the teacher has no access to the learner's model and policy, and the learner's feedback is limited to trajectories that start from states selected by the teacher. Because the teacher must select the starting states and infer the learner's policy, it can draw on methods from inverse reinforcement learning and active learning. In this work, we formalize the teaching process with limited feedback and propose an algorithm that solves this teaching problem. The algorithm uses a modified version of the active value-at-risk method to select the starting states, a modified maximum causal entropy algorithm to infer the policy, and the difficulty score ratio method to choose the teaching demonstrations. We test the algorithm in a synthetic car driving environment and conclude that the proposed algorithm is an effective solution when the learner's feedback is limited.
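
The interaction described in the abstract can be pictured as a loop with three pluggable steps. The following is a minimal sketch only, not the authors' implementation: the function `teach`, its parameter names, and the ordering of steps within a round are assumptions made for illustration. In the paper's setting, `select_start_states` would stand for the modified active value-at-risk method, `infer_policy` for the modified maximum causal entropy algorithm, and `choose_demonstration` for the difficulty score ratio method; none of those methods is implemented here.

```python
from typing import Callable, List, Sequence, Tuple

State = int
Trajectory = List[Tuple[State, int]]  # sequence of (state, action) pairs


def teach(
    rounds: int,
    select_start_states: Callable[[object], Sequence[State]],
    query_learner: Callable[[Sequence[State]], List[Trajectory]],
    infer_policy: Callable[[object, List[Trajectory]], object],
    choose_demonstration: Callable[[object], Trajectory],
    send_demonstration: Callable[[Trajectory], None],
    initial_estimate: object = None,
) -> object:
    """Run a hypothetical limited-feedback teaching loop.

    All callables are placeholders supplied by the caller. The teacher
    never reads the learner's policy directly; it only sees trajectories
    returned by `query_learner`.
    """
    estimate = initial_estimate
    for _ in range(rounds):
        # 1. Teacher picks the states from which the learner must act.
        starts = select_start_states(estimate)
        # 2. Learner's only feedback: trajectories from those states.
        feedback = query_learner(starts)
        # 3. Teacher updates its estimate of the learner's policy.
        estimate = infer_policy(estimate, feedback)
        # 4. Teacher chooses and sends the next demonstration.
        send_demonstration(choose_demonstration(estimate))
    return estimate
```

Passing the three methods as callables keeps the loop itself independent of how states are selected, how the policy is inferred, or how demonstrations are ranked, so each step can be swapped or stubbed out separately.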

research
05/28/2019

Interactive Teaching Algorithms for Inverse Reinforcement Learning

We study the problem of inverse reinforcement learning (IRL) with the ad...
research
05/20/2018

Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications

Inverse reinforcement learning (IRL) infers a reward function from demon...
research
09/05/2020

Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners

Successful teaching requires an assumption of how the learner learns - h...
research
06/08/2021

Curriculum Design for Teaching via Demonstrations: Theory and Applications

We consider the problem of teaching via demonstrations in sequential dec...
research
10/16/2012

Dynamic Teaching in Sequential Decision Making Environments

We describe theoretical bounds and a practical algorithm for teaching a ...
research
10/10/2019

Manifold learning from a teacher's demonstrations

We consider the problem of manifold learning. Extending existing approac...
research
01/21/2017

Interactive Learning from Policy-Dependent Human Feedback

For agents and robots to become more useful, they must be able to quickl...
