Guided Policy Search for Parameterized Skills using Adverbs

10/23/2021
by Benjamin A. Spiegel, et al.

We present a method for using adverb phrases to adjust skill parameters via learned adverb-skill groundings. These groundings allow an agent to use adverb feedback provided by a human to directly update a skill policy, in a manner similar to traditional local policy search methods. We show that our method can be used as a drop-in replacement for these policy search methods when dense reward from the environment is not available but human language feedback is. We demonstrate improved sample efficiency over modern policy search methods in two experiments.
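As a rough illustration of the idea in the abstract, the sketch below treats each adverb as a direction in a skill's parameter space and nudges the parameters along that direction whenever a human supplies that adverb. This is a toy under stated assumptions, not the authors' implementation: the `GROUNDINGS` table, the two-dimensional `[speed, force]` parameter layout, the `apply_adverb_feedback` function, and the fixed `step_size` are all hypothetical, and the groundings are hard-coded here whereas the paper learns them.

```python
from typing import Dict, List

# Hypothetical adverb-skill groundings: each adverb maps to a direction in the
# skill's parameter space. Hard-coded for illustration; the paper learns them.
# Assumed parameter layout: [speed, force].
GROUNDINGS: Dict[str, List[float]] = {
    "faster": [1.0, 0.0],
    "slower": [-1.0, 0.0],
    "harder": [0.0, 1.0],
    "softer": [0.0, -1.0],
}


def apply_adverb_feedback(
    params: List[float], adverb: str, step_size: float = 0.1
) -> List[float]:
    """Return a new parameter vector nudged along the adverb's grounded direction."""
    direction = GROUNDINGS[adverb]
    return [p + step_size * d for p, d in zip(params, direction)]


# Simulated human feedback over three executions of the skill.
initial_params = [0.5, 0.5]
params = initial_params
for adverb in ["faster", "faster", "softer"]:
    params = apply_adverb_feedback(params, adverb)
```

Each piece of feedback acts like one step of a local policy search update, except that the step direction comes from language rather than from an estimated reward gradient.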

research
10/14/2022

Skill-Based Reinforcement Learning with Intrinsic Reward Matching

While unsupervised skill discovery has shown promise in autonomously acq...
research
06/27/2012

Learning Parameterized Skills

We introduce a method for constructing skills capable of solving tasks d...
research
01/19/2023

Keyframe Demonstration Seeded and Bayesian Optimized Policy Search

This paper introduces a novel Learning from Demonstration framework to l...
research
04/08/2021

Learning What To Do by Simulating the Past

Since reward functions are hard to specify, recent work has focused on l...
research
06/07/2022

Meta-Learning Transferable Parameterized Skills

We propose a novel parameterized skill-learning algorithm that aims to l...
research
05/19/2018

Autonomous discovery of the goal space to learn a parameterized skill

A parameterized skill is a mapping from multiple goals/task parameters t...
research
10/28/2021

Wasserstein Distance Maximizing Intrinsic Control

This paper deals with the problem of learning a skill-conditioned policy...
