Safe Value Functions

The relationship between safety and optimality in control is not well understood, and the two are often seen as important yet conflicting objectives. There is a pressing need to formalize this relationship, especially given the growing prominence of learning-based methods. Indeed, it is common practice in reinforcement learning to simply modify reward functions by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine this relationship and formalize the requirements for safe value functions: value functions that are both optimal for a given task and enforce safety. We reveal the structure of this relationship through a proof of strong duality, showing that there always exists a finite penalty that induces a safe value function. This penalty is not unique but unbounded above: larger penalties do not harm optimality. Although it is often not possible to compute the minimum required penalty, we reveal clear structure in how the penalty, rewards, discount factor, and dynamics interact. This insight suggests practical, theory-guided heuristics to design reward functions for control problems where safety is important.
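The penalty claim can be illustrated on a toy problem. The sketch below is not taken from the paper: the three-state deterministic MDP, its rewards, the discount factor of 0.9, and all names (`build_mdp`, `solve`, `SAFE`, `DOOMED`, `FAILED`) are assumptions chosen for illustration. A tempting reward of 20 lures the agent out of the safe state into a state from which failure is unavoidable, and the penalty is charged on the transition into failure. Plain value iteration then shows the greedy policy for several penalty values.

```python
GAMMA = 0.9                      # assumed discount factor
SAFE, DOOMED, FAILED = 0, 1, 2   # hypothetical state labels
STAY, RISK = 0, 1                # hypothetical action labels


def build_mdp(penalty):
    """Return {(state, action): (next_state, reward)} for the toy problem."""
    return {
        (SAFE, STAY): (SAFE, 1.0),             # modest reward, remains safe
        (SAFE, RISK): (DOOMED, 20.0),          # tempting reward, leaves the safe state
        (DOOMED, STAY): (FAILED, -penalty),    # failure is unavoidable from here
        (DOOMED, RISK): (FAILED, -penalty),
        (FAILED, STAY): (FAILED, 0.0),         # absorbing failure state
        (FAILED, RISK): (FAILED, 0.0),
    }


def q_value(mdp, values, state, action):
    """One-step lookahead value of taking `action` in `state`."""
    next_state, reward = mdp[(state, action)]
    return reward + GAMMA * values[next_state]


def solve(penalty, sweeps=2000):
    """Run value iteration; return (state values, greedy policy)."""
    mdp = build_mdp(penalty)
    values = [0.0, 0.0, 0.0]
    for _ in range(sweeps):
        values = [
            max(q_value(mdp, values, s, a) for a in (STAY, RISK))
            for s in range(3)
        ]
    policy = [
        max((STAY, RISK), key=lambda a: q_value(mdp, values, s, a))
        for s in range(3)
    ]
    return values, policy


if __name__ == "__main__":
    for penalty in (0.0, 20.0, 1000.0):
        values, policy = solve(penalty)
        action = "STAY" if policy[SAFE] == STAY else "RISK"
        print(f"penalty={penalty:7.1f}  V(SAFE)={values[SAFE]:6.2f}  action={action}")
```

In this toy, staying safe forever is worth 1 / (1 - 0.9) = 10, while taking the risk is worth 20 - 0.9 * penalty, so any penalty above 10 / 0.9 (about 11.1) makes the greedy policy safe. Raising the penalty further, from 20 to 1000, changes neither the action nor the value in the safe state, which mirrors the statement that larger penalties do not harm optimality. In realistic problems the threshold is usually not available in closed form like this.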


