Follow The Rules: Online Signal Temporal Logic Tree Search for Guided Imitation Learning in Stochastic Domains

09/27/2022
by   Jasmine Jerry Aloor, et al.

Seamlessly integrating rules into Learning-from-Demonstrations (LfD) policies is a critical requirement for the real-world deployment of AI agents. Recently, Signal Temporal Logic (STL) has been shown to be an effective language for encoding rules as spatio-temporal constraints. This work uses Monte Carlo Tree Search (MCTS) to integrate STL specifications into a vanilla LfD policy and improve constraint satisfaction. We propose augmenting the MCTS heuristic with STL robustness values to bias the tree search towards branches with higher constraint satisfaction. Although the method is domain-independent and can integrate STL rules online into any pre-trained LfD algorithm, we choose goal-conditioned Generative Adversarial Imitation Learning as the offline LfD policy. We apply the proposed method to planning trajectories for General Aviation aircraft around a non-towered airfield. Results using a simulator trained on real-world data show a 60% performance improvement over baseline LfD methods that do not use STL heuristics.
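To make the idea of biasing the tree search with STL robustness concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PUCT-style selection rule with an additive robustness term; the function names, the weights `c_puct` and `c_stl`, and the altitude-floor rule are all hypothetical and chosen only for illustration. Robustness is computed with the standard quantitative semantics for simple predicates: the worst-case margin for "always" and the best-case margin for "eventually".

```python
import math
from typing import Sequence


def rob_always_gt(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'always (x > threshold)': the smallest margin in the trace.

    Positive means the rule holds at every step; negative means it is violated.
    """
    return min(x - threshold for x in signal)


def rob_eventually_gt(signal: Sequence[float], threshold: float) -> float:
    """Robustness of 'eventually (x > threshold)': the largest margin in the trace."""
    return max(x - threshold for x in signal)


def selection_score(
    q_value: float,
    prior: float,
    robustness: float,
    parent_visits: int,
    child_visits: int,
    c_puct: float = 1.0,  # hypothetical exploration weight
    c_stl: float = 0.5,   # hypothetical weight on the STL robustness term
) -> float:
    """Assumed selection heuristic: value + prior-weighted exploration + STL bias.

    `prior` stands in for the action probability given by a pre-trained LfD policy.
    """
    explore = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + explore + c_stl * robustness


# Two candidate branches with identical search statistics but different
# rollouts, checked against a hypothetical rule "always stay above 2.0".
safe_rollout = [3.0, 2.5, 4.0]
unsafe_rollout = [3.0, 1.5]

safe_score = selection_score(0.2, 0.5, rob_always_gt(safe_rollout, 2.0), 10, 2)
unsafe_score = selection_score(0.2, 0.5, rob_always_gt(unsafe_rollout, 2.0), 10, 2)

# The branch that satisfies the rule ranks higher.
best = "safe" if safe_score > unsafe_score else "unsafe"
```

With all other statistics equal, the branch whose rollout satisfies the rule receives the higher score, so the search expands it first. How strongly rule satisfaction should outweigh the learned policy's preference depends on the choice of `c_stl`, which this sketch leaves as a free parameter.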


