Model-Free Reinforcement Learning for Symbolic Automata-encoded Objectives

02/04/2022
by Anand Balakrishnan, et al.

Reinforcement learning (RL) is a popular approach for robotic path planning in uncertain environments. However, the control policies trained for an RL agent crucially depend on user-defined, state-based reward functions. Poorly designed rewards can lead to policies that attain maximal reward but are unsafe or fail to satisfy the desired task objectives. Formal languages such as temporal logics and automata have been used in several works to express high-level task specifications for robots (in lieu of Markovian rewards). Recent efforts have focused on inferring state-based rewards from formal specifications; here, the goal is to provide (probabilistic) guarantees that the policy learned using RL (with the inferred rewards) satisfies the high-level formal specification. A key drawback of several of these techniques is that the rewards they infer are sparse: the agent receives positive rewards only upon completion of the task and no rewards otherwise. This naturally leads to poor convergence properties and high variance during RL. In this work, we propose using formal specifications in the form of symbolic automata: these serve as a generalization of both bounded-time temporal logic-based specifications and automata. Furthermore, our use of symbolic automata allows us to define non-sparse potential-based rewards which empirically shape the reward surface, leading to better convergence during RL. We also show that our potential-based rewarding strategy still allows us to obtain the policy that maximizes the satisfaction of the given specification.
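
To make the contrast between sparse and potential-based rewards concrete, here is a minimal sketch of the standard potential-based shaping term F(q, q') = gamma * Phi(q') - Phi(q) over the states of a small automaton. It is an illustration only, not the reward defined in the paper: the three-state automaton, the names `distances_to_accepting` and `shaped_reward`, and the choice of Phi as the negative hop count to the nearest accepting state are all assumptions made for this example.

```python
from collections import deque


def distances_to_accepting(transitions, accepting):
    """Hop count from each automaton state to the nearest accepting state.

    Runs a breadth-first search over reversed edges. States that cannot
    reach an accepting state are left out of the result.
    """
    reverse = {}
    for src, dsts in transitions.items():
        for dst in dsts:
            reverse.setdefault(dst, set()).add(src)

    dist = {q: 0 for q in accepting}
    queue = deque(accepting)
    while queue:
        q = queue.popleft()
        for p in reverse.get(q, ()):
            if p not in dist:
                dist[p] = dist[q] + 1
                queue.append(p)
    return dist


def shaped_reward(dist, q, q_next, gamma=0.99):
    """Potential-based shaping term gamma * Phi(q_next) - Phi(q).

    Assumption: Phi(q) = -(hops from q to an accepting state). This is a
    hypothetical potential chosen for illustration.
    """
    def phi(state):
        return -float(dist[state])

    return gamma * phi(q_next) - phi(q)


# Hypothetical automaton: q0 -> q1 -> q2, with self-loops; q2 is accepting.
TRANSITIONS = {"q0": {"q0", "q1"}, "q1": {"q1", "q2"}, "q2": {"q2"}}
DIST = distances_to_accepting(TRANSITIONS, ["q2"])
```

Under these assumptions, every transition that moves one step closer to the accepting state earns a nonzero signal, whereas a sparse scheme would pay out only on reaching `q2`.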

