Runtime-Safety-Guided Policy Repair

08/17/2020
by   Weichao Zhou, et al.

We study the problem of policy repair for learning-based control policies in safety-critical settings. We consider an architecture in which a high-performance learning-based control policy (e.g., one trained as a neural network) is paired with a model-based safety controller. The safety controller can predict whether the trained policy will lead the system to an unsafe state and takes over control when necessary. While this architecture can provide added safety assurances, frequent, intermittent switching between the trained policy and the safety controller can result in undesirable behaviors and reduced performance. We propose to reduce or even eliminate control switching by 'repairing' the trained policy, based on runtime data produced by the safety controller, in a way that deviates minimally from the original policy. The key idea behind our approach is the formulation of a trajectory optimization problem that allows joint reasoning about the policy update and the safety constraints. Experimental results demonstrate that our approach is effective even when the system model in the safety controller is unknown and only approximated.
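The switching architecture described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the one-dimensional system `x' = x + u * DT`, the safe set `|x| <= X_MAX`, the stand-in `learned_policy`, the `safety_controller` rule, and the prediction horizon. The sketch only shows the runtime monitor and takeover logic, and it logs each intervention as the kind of runtime data a repair step might consume. It does not implement the trajectory optimization itself.

```python
from typing import Callable, List, Tuple

Policy = Callable[[float], float]

DT = 0.1     # integration step (assumed)
X_MAX = 1.0  # safe set is |x| <= X_MAX (assumed)


def step(x: float, u: float) -> float:
    """Toy single-integrator model used by the safety controller (assumed)."""
    return x + u * DT


def predicts_unsafe(x: float, policy: Policy, horizon: int = 5) -> bool:
    """Roll the model forward under the policy; report any unsafe state."""
    for _ in range(horizon):
        x = step(x, policy(x))
        if abs(x) > X_MAX:
            return True
    return False


def learned_policy(x: float) -> float:
    """Hypothetical stand-in for a trained network: always push right."""
    return 1.0


def safety_controller(x: float) -> float:
    """Hypothetical fallback: steer back toward the origin."""
    return -x


def run(policy: Policy, steps: int = 100) -> Tuple[List[float], List[Tuple[float, float, float]]]:
    """Simulate the paired architecture and log every takeover."""
    x = 0.0
    trajectory = [x]
    interventions = []  # (state, policy action, safe action) at each takeover
    for _ in range(steps):
        u = policy(x)
        if predicts_unsafe(x, policy):
            u_safe = safety_controller(x)
            interventions.append((x, u, u_safe))
            u = u_safe
        x = step(x, u)
        trajectory.append(x)
    return trajectory, interventions


if __name__ == "__main__":
    trajectory, interventions = run(learned_policy)
    print(f"max |x| = {max(abs(x) for x in trajectory):.3f}")
    print(f"takeovers = {len(interventions)} of {len(trajectory) - 1} steps")
```

In this toy setting the state stays inside the safe set, but control alternates between the two controllers once the state approaches the boundary, which is the switching behavior the abstract identifies as the motivation for repair.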

research
03/12/2023

Certifiably-correct Control Policies for Safe Learning and Adaptation in Assistive Robotics

Guaranteeing safety in human-centric applications is critical in robot l...
research
05/25/2019

Safe Reinforcement Learning via Online Shielding

Reinforcement learning is a promising approach to learning control polic...
research
11/15/2021

Joint Synthesis of Safety Certificate and Safe Control Policy using Constrained Reinforcement Learning

Safety is the major consideration in controlling complex dynamical syste...
research
07/14/2020

Using GSM SMS controller alarm Configurator to develop cost effective, intelligent fire safety system in a developing country

Electricity supply to facilities is essential, but can cause fires when ...
research
01/26/2023

A Robust Optimisation Perspective on Counterexample-Guided Repair of Neural Networks

Counterexample-guided repair aims at creating neural networks with mathe...
research
08/01/2019

Neural Simplex Architecture

We present the Neural Simplex Architecture (NSA), a new approach to runt...
research
08/09/2021

Neural Network Repair with Reachability Analysis

Safety is a critical concern for the next generation of autonomy that is...
