On the Value of Myopic Behavior in Policy Reuse

05/28/2023
by Kang Xu, et al.

Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence. In reinforcement learning, rationally reusing the policies acquired from other tasks or human experts is critical for tackling problems that are difficult to learn from scratch. In this work, we present a framework called Selective Myopic bEhavior Control (SMEC), which results from the insight that the short-term behaviors of prior policies are sharable across tasks. By evaluating the behaviors of prior policies via a hybrid value function architecture, SMEC adaptively aggregates the sharable short-term behaviors of prior policies and the long-term behaviors of the task policy, leading to coordinated decisions. Empirical results on a collection of manipulation and locomotion tasks demonstrate that SMEC outperforms existing methods, and validate the ability of SMEC to leverage related prior policies.
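
The abstract does not spell out the hybrid value function architecture, so the following is only a minimal sketch of the general idea of evaluating the actions proposed by prior policies alongside the task policy's action and acting on the best-scoring one. It is not SMEC itself. The names `select_action`, `q_value`, `task_policy`, and `prior_policies`, as well as the toy quadratic value function, are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

State = Sequence[float]
Action = float
Policy = Callable[[State], Action]


def select_action(
    state: State,
    task_policy: Policy,
    prior_policies: List[Policy],
    q_value: Callable[[State, Action], float],
) -> Tuple[Action, int]:
    """Pick the highest-valued candidate action.

    Returns (action, source_index). Index 0 is the task policy and
    index i >= 1 is prior_policies[i - 1]. Ties go to the lowest index,
    so the task policy is preferred when values are equal.
    """
    # Each policy proposes one action for the current state.
    candidates = [task_policy(state)] + [p(state) for p in prior_policies]
    # Score every proposal with the (hypothetical) value estimate.
    scores = [q_value(state, a) for a in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], best


# Toy setup (all assumed): the value estimate prefers actions close to 1.0.
def toy_q(state: State, action: Action) -> float:
    return -((action - 1.0) ** 2)


def toy_task_policy(state: State) -> Action:
    return 0.2  # an untrained task policy, far from the preferred action


toy_priors: List[Policy] = [lambda s: 0.9, lambda s: -0.5]

result = select_action([0.0], toy_task_policy, toy_priors, toy_q)
```

In this toy case the first prior policy proposes the action closest to the preferred one, so its action is selected. In a real agent the value estimate would be learned, and how short-term and long-term evaluations are combined is exactly the part the paper's full text defines.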

research
10/07/2019

Policies Modulating Trajectory Generators

We propose an architecture for learning complex controllable behaviors b...
research
06/03/2021

Lifetime policy reuse and the importance of task capacity

A long-standing challenge in artificial intelligence is lifelong learnin...
research
10/07/2021

Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks

Realistic manipulation tasks require a robot to interact with an environ...
research
09/30/2022

Efficiently Learning Small Policies for Locomotion and Manipulation

Neural control of memory-constrained, agile robots requires small, yet h...
research
09/10/2016

Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks

We consider scenarios from the real-time strategy game StarCraft as new ...
research
05/01/2015

Bayesian Policy Reuse

A long-lived autonomous agent should be able to respond online to novel ...
research
04/05/2021

FABRIC: A Framework for the Design and Evaluation of Collaborative Robots with Extended Human Adaptation

A limitation for collaborative robots (cobots) is their lack of ability ...
