
Reinforcement Learning Agent Training with Goals for Real World Tasks

by Xuan Zhao, et al.

Reinforcement Learning (RL) is a promising approach for solving various control, optimization, and sequential decision-making tasks. However, designing reward functions for complex tasks (e.g., with multiple objectives and safety constraints) can be challenging for most users and usually requires multiple expensive rounds of trial and error (reward function hacking). In this paper, we propose a specification language (Inkling Goal Specification) for complex control and optimization tasks that is very close to natural language and lets a practitioner focus on problem specification instead of reward function hacking. The core elements of our framework are: (i) a mapping from the high-level language to a predicate temporal logic tailored to control and optimization tasks, (ii) a novel automaton-guided dense reward generation method that can be used to drive RL algorithms, and (iii) a set of performance metrics to assess the behavior of the system. We include a set of experiments showing that the proposed method makes it easy to specify a wide range of real-world tasks and that the generated reward is able to drive policy training to achieve the specified goal.
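
To make the idea of an automaton-guided dense reward more concrete, here is a minimal Python sketch. It is not the paper's algorithm, which the abstract does not detail. It assumes a hypothetical one-dimensional "reach a target interval while avoiding an unsafe interval" goal, and the names `Interval` and `ReachAvoidMonitor`, the three automaton states, and the reward values are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Closed interval [low, high] on a scalar state variable (hypothetical)."""
    low: float
    high: float

    def contains(self, x: float) -> bool:
        return self.low <= x <= self.high

    def distance(self, x: float) -> float:
        """Distance from x to the interval (0.0 when x is inside)."""
        if x < self.low:
            return self.low - x
        if x > self.high:
            return x - self.high
        return 0.0


class ReachAvoidMonitor:
    """Illustrative three-state automaton: 'active' -> 'succeeded' or 'failed'."""

    def __init__(self, target: Interval, unsafe: Interval):
        self.target = target
        self.unsafe = unsafe
        self.state = "active"

    def step(self, prev_x: float, x: float) -> float:
        """Advance the automaton on one observation and return a reward."""
        if self.state != "active":
            return 0.0  # terminal states emit no further reward
        if self.unsafe.contains(x):
            self.state = "failed"
            return -1.0  # assumed penalty for violating the avoid objective
        if self.target.contains(x):
            self.state = "succeeded"
            return 1.0  # assumed bonus for satisfying the reach objective
        # Dense term: positive when this step moved closer to the target.
        return self.target.distance(prev_x) - self.target.distance(x)


# Usage: score a short trajectory that approaches and then enters the target.
monitor = ReachAvoidMonitor(Interval(0.9, 1.1), Interval(-2.0, -1.0))
trajectory = [0.0, 0.5, 0.75, 1.0]
rewards = [monitor.step(a, b) for a, b in zip(trajectory, trajectory[1:])]
```

The design choice illustrated here is that the automaton state decides which reward term applies, while the distance-based term gives the agent a learning signal before any objective is satisfied or violated.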



