Bayesian Safety Validation for Black-Box Systems

by Robert J. Moss, et al.

Accurately estimating the probability of failure for safety-critical systems is important for certification. Estimation is often challenging due to high-dimensional input spaces, dangerous test scenarios, and computationally expensive simulators, which motivates the study of efficient estimation techniques. This work reframes the problem of black-box safety validation as a Bayesian optimization problem and introduces an algorithm, Bayesian safety validation, that iteratively fits a probabilistic surrogate model to efficiently predict failures. The algorithm is designed to search for failures, compute the most-likely failure, and estimate the failure probability over an operating domain using importance sampling. We introduce a set of three acquisition functions that focus on reducing uncertainty by covering the design space, optimizing the analytically derived failure boundaries, and sampling the predicted failure regions. Although the method is mainly concerned with systems that only output a binary indication of failure, we show that it also works well in cases where more output information is available. Results show that Bayesian safety validation achieves a better estimate of the probability of failure using orders of magnitude fewer samples and performs well across various safety validation metrics. We demonstrate the algorithm on three test problems with access to ground truth and on a real-world safety-critical subsystem common in autonomous flight: a neural network-based runway detection system. This work is open sourced and currently being used to supplement the FAA certification process of the machine learning components for an autonomous cargo aircraft.
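To make the importance sampling step concrete, here is a minimal Python sketch of estimating a failure probability for a system that only reports a binary failure indication. It is not the paper's algorithm: the toy system `is_failure`, the standard normal operating distribution, and the fixed Gaussian proposal are all assumptions chosen for illustration, whereas the abstract describes a method that uses a fitted surrogate model to decide where to sample.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def is_failure(x):
    """Hypothetical black-box system: returns only a binary failure flag."""
    return x > 3.0


def estimate_failure_probability(n_samples=100_000, seed=0):
    """Importance sampling estimate of P(failure) under p = N(0, 1).

    Samples are drawn from an assumed proposal q = N(3.5, 1) that
    concentrates on the failure region, and each failure is reweighted
    by the likelihood ratio p(x) / q(x).
    """
    rng = random.Random(seed)
    q_mean, q_std = 3.5, 1.0  # assumed proposal, not taken from the paper
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(q_mean, q_std)
        if is_failure(x):
            total += normal_pdf(x, 0.0, 1.0) / normal_pdf(x, q_mean, q_std)
    return total / n_samples


if __name__ == "__main__":
    estimate = estimate_failure_probability()
    exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # exact value for this toy case
    print(f"estimate={estimate:.6f} exact={exact:.6f}")
```

For this toy problem the exact failure probability is about 0.00135, so sampling directly from the operating distribution would see a failure roughly once per 740 draws. Shifting the proposal toward the failure region is what lets the estimator use far fewer evaluations of the system for the same accuracy.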



