RAPID: A Reachable Anytime Planner for Imprecisely-sensed Domains

03/15/2012
by Emma Brunskill, et al.

Despite the intractability of generic optimal partially observable Markov decision process planning, there exist important problems that have highly structured models. Previous researchers have used this insight to construct more efficient algorithms for factored domains, and for domains with topological structure in the flat state dynamics model. In our work, motivated by findings from the education community relevant to automated tutoring, we consider problems that exhibit a form of topological structure in the factored dynamics model. Our Reachable Anytime Planner for Imprecisely-sensed Domains (RAPID) leverages this structure to efficiently compute a good initial envelope of reachable states under the optimal MDP policy in time linear in the number of state variables. RAPID performs partially-observable planning over the limited envelope of states, and slowly expands the state space considered as time allows. RAPID performs well on a large tutoring-inspired problem simulation with 122 state variables, corresponding to a flat state space of over 10^30 states.
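To make the planning loop described above concrete, the following is a minimal, self-contained Python sketch of the anytime envelope-expansion idea. Everything here (the toy transition model, the helper names `reachable_envelope` and `plan_over_envelope`, and the depth-based expansion schedule) is an illustrative assumption rather than the authors' implementation; the actual RAPID algorithm computes its initial envelope under the optimal MDP policy in time linear in the number of state variables and then performs partially observable planning restricted to that envelope.

```python
import time

def reachable_envelope(transitions, policy, start, depth):
    """States reachable from `start` within `depth` steps when following `policy`."""
    frontier, envelope = {start}, {start}
    for _ in range(depth):
        frontier = {s2 for s in frontier for s2 in transitions[s][policy[s]]} - envelope
        if not frontier:
            break
        envelope |= frontier
    return envelope

def plan_over_envelope(envelope):
    """Placeholder for partially observable planning restricted to the envelope."""
    return {s: "default_action" for s in envelope}

def rapid_anytime(transitions, mdp_policy, start, budget_seconds):
    """Plan over a small envelope first, then grow the envelope while time remains."""
    deadline = time.time() + budget_seconds
    depth = 1
    envelope = reachable_envelope(transitions, mdp_policy, start, depth)
    policy = plan_over_envelope(envelope)
    while time.time() < deadline and len(envelope) < len(transitions):
        depth += 1  # expand the set of states considered as time allows
        envelope = reachable_envelope(transitions, mdp_policy, start, depth)
        policy = plan_over_envelope(envelope)
    return policy

# Example usage: a 4-state chain with a single action "a".
transitions = {0: {"a": [1]}, 1: {"a": [2]}, 2: {"a": [3]}, 3: {"a": [3]}}
policy = {s: "a" for s in transitions}
print(sorted(rapid_anytime(transitions, policy, start=0, budget_seconds=0.01)))
```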


