Interpretable Policy Specification and Synthesis through Natural Language and RL

01/18/2021
by Pradyumna Tambwekar, et al.

Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: these models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with a 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique.
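
To make the phrase "interpretable policies in the form of easy-to-understand decision trees" concrete, the following is a minimal sketch of what such a policy could look like as a data structure. It is an illustration only, not the authors' implementation: the class names (`Leaf`, `Node`), the function `act`, and the feature and action names (`distance_to_obstacle`, `obstacle_on_left`, `move_forward`, and so on) are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Terminal node: the action the policy takes."""
    action: str


@dataclass
class Node:
    """Internal node: tests whether state[feature] > threshold."""
    feature: str
    threshold: float
    if_true: "Tree"
    if_false: "Tree"


Tree = Union[Leaf, Node]


def act(tree: Tree, state: Dict[str, float]) -> str:
    """Walk the tree from the root to a leaf and return that leaf's action."""
    while isinstance(tree, Node):
        if state[tree.feature] > tree.threshold:
            tree = tree.if_true
        else:
            tree = tree.if_false
    return tree.action


# Hypothetical policy, roughly "go forward if the path is clear,
# otherwise turn away from the obstacle".
policy = Node(
    feature="distance_to_obstacle",
    threshold=2.0,
    if_true=Leaf("move_forward"),
    if_false=Node(
        feature="obstacle_on_left",
        threshold=0.5,
        if_true=Leaf("turn_right"),
        if_false=Leaf("turn_left"),
    ),
)

print(act(policy, {"distance_to_obstacle": 3.0, "obstacle_on_left": 0.0}))
```

Because every decision is an explicit feature test, a non-expert can read the policy directly. A tree of this kind could then serve as the starting point that RL refines, although how the paper performs that warm-start is not shown here.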
