Sample-efficient Safe Learning for Online Nonlinear Control with Control Barrier Functions

07/29/2022
by   Wenhao Luo, et al.

Reinforcement Learning (RL) and continuous nonlinear control have been successfully deployed in multiple domains involving complicated sequential decision-making tasks. However, given the exploratory nature of the learning process and the presence of model uncertainty, it is challenging to apply them to safety-critical control tasks due to the lack of safety guarantees. On the other hand, while combining control-theoretic approaches with learning algorithms has shown promise in safe RL applications, the sample efficiency of the safe data collection process for control is not well addressed. In this paper, we propose a provably sample-efficient episodic safe learning framework for online control tasks that leverages safe exploration and exploitation in an unknown, nonlinear dynamical system. In particular, the framework 1) extends control barrier functions (CBFs) in a stochastic setting to achieve provable high-probability safety under uncertainty during model learning and 2) integrates an optimism-based exploration strategy to efficiently guide the safe exploration process with learned dynamics for near-optimal control performance. We provide a formal analysis of the episodic regret bound against the optimal controller and of probabilistic safety, with theoretical guarantees. Simulation results demonstrate the effectiveness and efficiency of the proposed algorithm.
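
For readers unfamiliar with control barrier functions, the following is a minimal sketch of the standard deterministic CBF safety filter, assuming known dynamics and a scalar control input. It is not the stochastic extension or the optimism-based exploration strategy proposed in the paper. The function names (cbf_filter, simulate), the toy system x' = u, the barrier h(x) = x_max - x, and all parameter values are assumptions chosen for illustration. The filter minimally modifies a nominal control so that the usual condition L_f h(x) + L_g h(x) u + alpha h(x) >= 0 holds.

```python
def cbf_filter(u_nom, a, b):
    """Return the control closest to u_nom that satisfies a + b * u >= 0.

    Here a = L_f h(x) + alpha * h(x) and b = L_g h(x). For a scalar control
    this one-constraint quadratic program has a closed-form solution.
    """
    if a + b * u_nom >= 0.0:
        return u_nom              # nominal control is already safe
    if b == 0.0:
        raise ValueError("CBF constraint is infeasible at this state")
    return -a / b                 # project onto the constraint boundary


def simulate(x0=0.0, x_max=1.0, u_nom=2.0, alpha=1.0, dt=0.01, steps=1000):
    """Euler simulation of the toy system x' = u with safe set {x <= x_max}."""
    x = x0
    traj = [x]
    for _ in range(steps):
        h = x_max - x             # barrier value, nonnegative inside the safe set
        a = alpha * h             # L_f h = 0 because the drift term is zero
        b = -1.0                  # L_g h = dh/dx * g(x) = -1
        u = cbf_filter(u_nom, a, b)
        x = x + dt * u
        traj.append(x)
    return traj


if __name__ == "__main__":
    traj = simulate()
    print("final state:", traj[-1], "max state:", max(traj))
```

In this toy setting the nominal control constantly pushes the state toward the boundary, and the filter slows it down so that the state approaches x_max without crossing it. Handling unknown dynamics, as the abstract describes, would require replacing the exact terms a and b with estimates and accounting for their uncertainty.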


research
03/21/2019

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

Reinforcement Learning (RL) algorithms have found limited success beyond...
research
08/26/2020

Safe Model-Based Meta-Reinforcement Learning: A Sequential Exploration-Exploitation Framework

Safe deployment of autonomous robots in diverse environments requires ag...
research
01/31/2023

Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees

Robustness and safety are critical for the trustworthy deployment of dee...
research
05/09/2020

Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems

Learning-based control algorithms require collection of abundant supervi...
research
04/07/2020

Learning Control Barrier Functions from Expert Demonstrations

Inspired by the success of imitation and inverse reinforcement learning ...
research
07/07/2020

Provably Safe PAC-MDP Exploration Using Analogies

A key challenge in applying reinforcement learning to safety-critical do...
research
11/09/2018

Reachability-based safe learning for optimal control problem

In this work we seek for an approach to integrate safety in the learning...
