Safe Inverse Reinforcement Learning via Control Barrier Function

by Yue Yang, et al.

Learning from Demonstration (LfD) is a powerful method for enabling robots to perform novel tasks: it is often more tractable for a non-roboticist end-user to demonstrate the desired skill, and for the robot to learn efficiently from the associated data, than for a human to engineer a reward function from which the robot learns the skill via reinforcement learning (RL). Safety issues arise in modern LfD techniques, e.g., Inverse Reinforcement Learning (IRL), just as they do in RL; yet safe learning in LfD has received little attention. For agile robots, safety is especially vital because of the possibility of robot-environment collision, robot-human collision, and damage to the robot. In this paper, we propose a safe IRL framework, CBFIRL, that leverages the Control Barrier Function (CBF) to enhance the safety of the IRL policy. The core idea of CBFIRL is to combine a loss function inspired by CBF requirements with the objective of an IRL method and to optimize the two jointly via gradient descent. In our experiments, the framework is safer than IRL methods without CBF, with ∼15% and ∼20% improvement at two difficulty levels of a 2D racecar domain and ∼50% improvement in a 3D drone domain.
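The idea of adding a CBF-inspired loss to an IRL objective can be illustrated with a minimal sketch. The abstract does not give the exact loss used in CBFIRL, so everything below is an assumption made for illustration: the hinge-style penalties, the discrete-time form of the barrier condition, and the names `cbf_loss`, `joint_loss`, `alpha`, `dt`, and `weight` are all hypothetical. The sketch only evaluates the combined loss on sampled states; in practice the barrier function and the policy would be differentiable models, and the sum would be minimized by gradient descent in an automatic-differentiation framework.

```python
import math
from typing import Callable, Sequence, Tuple

State = Sequence[float]


def hinge(x: float) -> float:
    """Return x if it is positive, otherwise 0 (penalizes violations only)."""
    return max(0.0, x)


def cbf_loss(
    h: Callable[[State], float],
    safe_states: Sequence[State],
    unsafe_states: Sequence[State],
    transitions: Sequence[Tuple[State, State]],
    alpha: float = 1.0,  # assumed class-K gain in the barrier condition
    dt: float = 0.1,     # assumed time step between consecutive states
) -> float:
    """Mean hinge penalty for violating three CBF-style requirements.

    1. h(s) >= 0 on states labeled safe.
    2. h(s) <= 0 on states labeled unsafe.
    3. (h(s') - h(s)) / dt + alpha * h(s) >= 0 along transitions (s, s').
    """
    terms = [hinge(-h(s)) for s in safe_states]
    terms += [hinge(h(s)) for s in unsafe_states]
    terms += [
        hinge(-((h(s_next) - h(s)) / dt + alpha * h(s)))
        for s, s_next in transitions
    ]
    return sum(terms) / max(len(terms), 1)


def joint_loss(irl_loss: float, cbf_penalty: float, weight: float = 1.0) -> float:
    """Combined objective: IRL loss plus a weighted CBF penalty (assumed form)."""
    return irl_loss + weight * cbf_penalty


# Hypothetical barrier: signed distance to a unit-radius obstacle at the origin.
def h(s: State) -> float:
    return math.hypot(s[0], s[1]) - 1.0


safe = [(2.0, 0.0), (0.0, 3.0)]
unsafe = [(0.5, 0.0)]

# Moving away from the obstacle satisfies the barrier condition.
moving_away = cbf_loss(h, safe, unsafe, [((2.0, 0.0), (2.1, 0.0))])
# Approaching the obstacle quickly violates it and is penalized.
moving_toward = cbf_loss(h, safe, unsafe, [((2.0, 0.0), (1.5, 0.0))])

print(moving_away, moving_toward)
print(joint_loss(irl_loss=0.5, cbf_penalty=moving_toward, weight=2.0))
```

The weight on the CBF term trades imitation accuracy against safety; the value shown here is arbitrary and is not taken from the paper.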


Reinforcement Learning for Safe Robot Control using Control Lyapunov Barrier Functions

Reinforcement learning (RL) exhibits impressive performance when managin...

Reachability-based Trajectory Safeguard (RTS): A Safe and Fast Reinforcement Learning Safety Layer for Continuous Control

Reinforcement Learning (RL) algorithms have achieved remarkable performa...

Lyapunov Barrier Policy Optimization

Deploying Reinforcement Learning (RL) agents in the real-world require t...

Learning Observation-Based Certifiable Safe Policy for Decentralized Multi-Robot Navigation

Safety is of great importance in multi-robot navigation problems. In thi...

Safe Reinforcement Learning for Grid Voltage Control

Under voltage load shedding has been considered as a standard approach t...

Learning to Play Table Tennis From Scratch using Muscular Robots

Dynamic tasks like table tennis are relatively easy to learn for humans ...

Show me what you want: Inverse reinforcement learning to automatically design robot swarms by demonstration

Automatic design is a promising approach to generating control software ...
