Infinite Time Horizon Safety of Bayesian Neural Networks

11/04/2021
by Mathias Lechner, et al.

Bayesian neural networks (BNNs) place distributions over the weights of a neural network to model uncertainty in the data and in the network's predictions. We consider the problem of verifying safety when a Bayesian neural network policy runs in a feedback loop with an infinite time horizon system. In contrast to existing sampling-based approaches, which are inapplicable in the infinite time horizon setting, we train a separate deterministic neural network that serves as an infinite time horizon safety certificate. In particular, we show that the certificate network guarantees the safety of the system over a subset of the support of the BNN weight posterior. Our method first computes a safe weight set and then alters the BNN's weight posterior to reject samples outside this set. Moreover, we show how to extend our approach to a safe-exploration reinforcement learning setting so that unsafe trajectories are avoided during training of the policy. We evaluate our approach on a series of reinforcement learning benchmarks, including non-Lyapunovian safety specifications.
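
To make the rejection step concrete, below is a minimal, hypothetical sketch of restricting a BNN weight posterior to a safe weight set. It assumes a toy Gaussian posterior over a flattened weight vector and an axis-aligned box as the verified safe set; the helper names (in_safe_set, sample_safe_weights), the box shape, and all numbers are illustrative assumptions, not the paper's actual construction or API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "posterior" over a flattened weight vector: independent Gaussians.
posterior_mean = rng.normal(size=16)
posterior_std = 0.1 * np.ones(16)

# Hypothetical safe weight set: a box [mean - eps, mean + eps] around the
# posterior mean, assumed to have been verified against the certificate network.
eps = 0.15
lower, upper = posterior_mean - eps, posterior_mean + eps


def in_safe_set(w):
    """Membership test for the (assumed) verified safe weight set."""
    return np.all((w >= lower) & (w <= upper))


def sample_safe_weights(n_samples, max_tries=10_000):
    """Rejection-sample BNN weights: draw from the posterior and discard
    any sample that falls outside the safe weight set."""
    accepted = []
    for _ in range(max_tries):
        w = rng.normal(posterior_mean, posterior_std)
        if in_safe_set(w):
            accepted.append(w)
            if len(accepted) == n_samples:
                break
    if len(accepted) < n_samples:
        raise RuntimeError("safe set too small relative to the posterior mass")
    return np.stack(accepted)


safe_weights = sample_safe_weights(8)
print(safe_weights.shape)  # (8, 16): eight weight samples guaranteed to lie in the box
```

The restricted posterior is exactly the original posterior conditioned on the safe set, so any policy executed with these weights stays within the region covered by the certificate.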

Related research

04/16/2021
Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions
This paper studies the problem of developing an approximate dynamic prog...

09/28/2022
Guiding Safe Exploration with Weakest Preconditions
In reinforcement learning for safety-critical settings, it is often desi...

02/26/2018
Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
Recent advances in deep reinforcement learning have made significant str...

10/11/2022
Learning Control Policies for Stochastic Systems with Reach-avoid Guarantees
We study the problem of learning controllers for discrete-time non-linea...

09/09/2022
RASR: Risk-Averse Soft-Robust MDPs with EVaR and Entropic Risk
Prior work on safe Reinforcement Learning (RL) has studied risk-aversion...

09/17/2021
Solving infinite-horizon Dec-POMDPs using Finite State Controllers within JESP
This paper looks at solving collaborative planning problems formalized a...

01/19/2019
Towards Physically Safe Reinforcement Learning under Supervision
This paper addresses the question of how a previously available control ...
