Falsification-Based Robust Adversarial Reinforcement Learning

by Xiao Wang, et al.
Technische Universität München

Reinforcement learning (RL) has achieved tremendous progress in solving various sequential decision-making problems, e.g., control tasks in robotics. However, RL methods often fail to generalize to safety-critical scenarios since policies are overfitted to training environments. Previously, robust adversarial reinforcement learning (RARL) was proposed to train an adversarial network that applies disturbances to a system, which improves robustness in test scenarios. A drawback of neural-network-based adversaries is that integrating system requirements without handcrafting sophisticated reward signals is difficult. Safety falsification methods allow one to find a set of initial conditions, as well as an input sequence, such that the system violates a given property formulated in temporal logic. In this paper, we propose falsification-based RARL (FRARL), the first generic framework for integrating temporal-logic falsification into adversarial learning to improve policy robustness. With this falsification method, no extra reward function needs to be constructed for the adversary. We evaluate our approach on a braking assistance system and an adaptive cruise control system for autonomous vehicles. Experiments show that policies trained with a falsification-based adversary generalize better and violate the safety specification less often in test scenarios than policies trained without an adversary or with an adversarial network.
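The abstract describes an alternation: a falsifier searches for a disturbance input sequence that drives the system into violating a temporal-logic safety property, and the policy is then improved against that counterexample, so no handcrafted adversary reward is needed. The following is a minimal, self-contained sketch of that loop under strong simplifying assumptions: a toy 1-D car-following model, the property "headway distance always stays above a margin" with its usual quantitative (STL-style) robustness value, random-search falsification, and a crude hill climb on a scalar braking gain as a stand-in for the RL update. All names and dynamics are illustrative and not taken from the paper.

```python
import random

def simulate(policy_gain, disturbances, x0=10.0, v0=2.0, dt=0.1):
    """Roll out the toy model; return the trace of headway distances."""
    x, v = x0, v0
    trace = [x]
    for d in disturbances:
        accel = -policy_gain * v + d   # braking policy plus adversarial input
        v = max(0.0, v + accel * dt)   # ego speed never goes negative
        x -= v * dt                    # headway shrinks while the ego moves
        trace.append(x)
    return trace

def robustness(trace, margin=0.5):
    """Quantitative semantics of 'always(distance > margin)': min over time.
    Negative robustness means the safety property is violated."""
    return min(s - margin for s in trace)

def falsify(policy_gain, horizon=100, budget=200):
    """Random-search falsifier: among `budget` candidate disturbance
    sequences, return the one minimizing robustness, plus its robustness."""
    rng = random.Random(0)
    best_d, best_rho = None, float("inf")
    for _ in range(budget):
        d = [rng.uniform(-0.5, 0.5) for _ in range(horizon)]
        rho = robustness(simulate(policy_gain, d))
        if rho < best_rho:
            best_d, best_rho = d, rho
    return best_d, best_rho

def train_frarl(iterations=10):
    """Alternate falsification and policy improvement. The 'policy' here is
    a single braking gain, updated greedily against the counterexample."""
    gain = 0.1
    for _ in range(iterations):
        d, rho = falsify(gain)
        if rho >= 0:               # no counterexample found: stop training
            break
        for candidate in (gain * 1.5, gain + 0.2):
            if robustness(simulate(candidate, d)) > robustness(simulate(gain, d)):
                gain = candidate
    return gain

trained_gain = train_frarl()
```

Note the key point the abstract makes: the adversary's objective is simply the negated robustness of the temporal-logic property, so no separate adversary reward has to be engineered.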




