Critic Sequential Monte Carlo

05/30/2022
by Vasileios Lioutas, et al.

We introduce CriticSMC, a new algorithm for planning as inference built from a novel composition of sequential Monte Carlo with learned soft-Q function heuristic factors. The algorithm is structured to allow the use of large numbers of putative particles, leading to efficient use of computational resources and effective discovery of high-reward trajectories, even in environments with difficult reward surfaces such as those arising from hard constraints. Relative to prior art, our approach remains compatible with model-free reinforcement learning, in the sense that the implicit policy it produces can be used at test time in the absence of a world model. Our experiments on self-driving car collision avoidance in simulation demonstrate improvements over baselines in infraction minimization relative to computational effort, while maintaining the diversity and realism of the trajectories found.
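As a rough illustration of the idea of scoring many putative particles with a critic before resampling, the following toy sketch may help. It is not the authors' implementation: the one-dimensional state, the Gaussian prior policy, the additive dynamics, and the names `toy_critic` and `putative_step` are all assumptions made for this example, and `toy_critic` is a hand-written stand-in for a learned soft-Q heuristic factor.

```python
import math
import random


def toy_critic(state, action):
    """Hypothetical stand-in for a learned soft-Q heuristic (log-space score).

    Assumption: a hard constraint at position 1.0; crossing it is heavily penalised.
    """
    return -10.0 if state + action > 1.0 else 0.0


def putative_step(states, rng, num_putative=16, noise=0.5):
    """One resampling step over putative (state, action) pairs."""
    # Propose several putative actions per particle from a Gaussian prior policy.
    candidates = [
        (s, rng.gauss(0.0, noise)) for s in states for _ in range(num_putative)
    ]
    # Score every candidate with the critic only; the dynamics are not called here.
    log_w = [toy_critic(s, a) for s, a in candidates]
    max_log_w = max(log_w)
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    # Resample back down to the original number of particles.
    chosen = rng.choices(candidates, weights=weights, k=len(states))
    # Advance only the selected pairs through the toy dynamics s' = s + a.
    return [s + a for s, a in chosen]


rng = random.Random(0)
particles = putative_step([0.9] * 64, rng)
```

In this sketch the number of dynamics evaluations stays equal to the number of particles, while the number of critic evaluations grows with `num_putative`. That is one way to read the abstract's claim about using computation efficiently, under the assumption that the critic is cheaper to evaluate than the world model.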
