Learning non-Markovian Decision-Making from State-only Sequences

06/27/2023
by Aoyang Qin, et al.

Conventional imitation learning assumes access to the actions of demonstrators, but these motor signals are often unobservable in naturalistic settings. Additionally, sequential decision-making behaviors in these settings can deviate from the assumptions of a standard Markov Decision Process (MDP). To address these challenges, we explore deep generative modeling of state-only sequences under a non-Markov Decision Process (nMDP), where the policy is an energy-based prior in the latent space of the state transition generator. We develop maximum likelihood estimation to achieve model-based imitation, which involves short-run MCMC sampling from the prior and importance sampling for the posterior. The learned model enables decision-making as inference: model-free policy execution is equivalent to prior sampling, and model-based planning is posterior sampling initialized from the policy. We demonstrate the efficacy of the proposed method in a prototypical path planning task with non-Markovian constraints and show that the learned model exhibits strong performance in challenging domains from the MuJoCo suite.
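To make the two sampling steps named in the abstract more concrete, the sketch below runs short-run Langevin dynamics (one common form of short-run MCMC) on a toy energy-based prior over latent actions, then reweights those samples with self-normalized importance weights to approximate a posterior given an observed next state. Everything specific here is an assumption for illustration: the quadratic energy, the additive Gaussian transition model, the step size, and the names `energy_grad`, `short_run_langevin`, and `posterior_weights` are hypothetical and are not taken from the paper, whose model conditions on state history and uses learned networks.

```python
import numpy as np

def energy_grad(a, center):
    # Gradient of a toy quadratic energy E(a) = 0.5 * ||a - center||^2.
    # (Assumption: stands in for a learned energy network.)
    return a - center

def short_run_langevin(center, n_samples=2000, n_steps=50, step=0.5, seed=0):
    # Short-run MCMC: a fixed, small number of Langevin steps from noise.
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n_samples, center.shape[0]))
    for _ in range(n_steps):
        noise = rng.standard_normal(a.shape)
        a = a - 0.5 * step**2 * energy_grad(a, center) + step * noise
    return a

def posterior_weights(actions, state, next_state, sigma=1.0):
    # Self-normalized importance weights with the prior as proposal:
    # w_i is proportional to the likelihood p(next_state | state, a_i).
    # (Assumption: toy transition next_state = state + a + Gaussian noise.)
    resid = next_state - (state + actions)
    log_w = -0.5 * np.sum(resid**2, axis=1) / sigma**2
    log_w -= log_w.max()  # subtract the max for numerical stability
    w = np.exp(log_w)
    return w / w.sum()

center = np.array([1.0, -1.0])
actions = short_run_langevin(center)          # approximate prior samples
w = posterior_weights(actions, np.zeros(2), np.array([2.0, 0.0]))
posterior_mean = w @ actions                  # weighted posterior estimate
```

In this toy setting, drawing from `short_run_langevin` plays the role of model-free policy execution, while the reweighted estimate plays the role of inferring the latent action behind an observed transition. Because the chain is short and the step size is finite, the samples only approximate the target distribution.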


