Graph Value Iteration

09/20/2022
by Dieqiao Feng, et al.

In recent years, deep Reinforcement Learning (RL) has been successful in various combinatorial search domains, such as two-player games and scientific discovery. However, directly applying deep RL in planning domains remains challenging. One major difficulty is that, without a human-crafted heuristic function, the reward signal stays at zero until the learning framework discovers a solution plan. The search space grows exponentially with the minimum plan length, which is a serious limitation for planning instances whose shortest plans run to hundreds or thousands of steps. Previous learning frameworks that augment graph search with deep neural networks and additional generated subgoals have achieved success in various challenging planning domains. However, generating useful subgoals requires extensive domain knowledge. We propose a domain-independent method that augments graph search with graph value iteration to solve hard planning instances that are out of reach for domain-specialized solvers. In particular, instead of receiving learning signals only from discovered plans, our approach also learns from failed search attempts in which no goal state has been reached. The graph value iteration component exploits the graph structure of the local search space to provide more informative learning signals. We also show how a curriculum strategy smooths the learning process, and we give a full analysis of how graph value iteration scales and enables learning.
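To make the idea of learning from a failed search concrete, the following is a minimal sketch of value iteration over an explored search graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `graph_value_iteration`, the unit `step_cost`, the cost-to-go formulation, and the `leaf_estimate` callback (standing in for whatever value estimate is available at unexpanded frontier states, such as a learned network) are all hypothetical. The point it shows is that backed-up values are defined for every explored state even when no goal was found.

```python
from typing import Callable, Dict, Hashable, Iterable, List

State = Hashable


def graph_value_iteration(
    graph: Dict[State, List[State]],
    goals: Iterable[State],
    leaf_estimate: Callable[[State], float],
    step_cost: float = 1.0,
) -> Dict[State, float]:
    """Estimate cost-to-go for every node of an explored search graph.

    `graph` maps each expanded state to its successors. States that appear
    only as successors are treated as unexpanded frontier nodes (assumption).
    """
    goal_set = set(goals)
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    values: Dict[State, float] = {}
    for s in nodes:
        if s in goal_set:
            values[s] = 0.0               # goals cost nothing to reach
        elif s not in graph:
            values[s] = leaf_estimate(s)  # frontier: bootstrap from estimate
        else:
            values[s] = float("inf")      # expanded: filled in by backups

    # Repeated Bellman backups: V(s) = step_cost + min over successors V(s').
    # Values only decrease, so at most len(nodes) sweeps are needed.
    for _ in range(len(nodes)):
        changed = False
        for s, successors in graph.items():
            if s in goal_set or not successors:
                continue  # goals stay at 0; dead ends stay at infinity
            best = step_cost + min(values[t] for t in successors)
            if best < values[s]:
                values[s] = best
                changed = True
        if not changed:
            break
    return values


# A failed search: no goal was reached, yet every explored state gets a target.
explored = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
frontier_estimates = {"D": 5.0, "E": 1.0}
values = graph_value_iteration(explored, [], frontier_estimates.__getitem__)
print(values["A"])  # 3.0
```

In this sketch the backed-up values could serve as regression targets for a value function, which is one plausible reading of how a failed search can still yield a learning signal; the actual targets and update rule are described in the full text.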


research
10/03/2021

A Novel Automated Curriculum Strategy to Solve Hard Sokoban Planning Instances

In recent years, we have witnessed tremendous progress in deep reinforce...
research
06/04/2020

Solving Hard AI Planning Instances Using Curriculum-Driven Deep Reinforcement Learning

Despite significant progress in general AI planning, certain domains rem...
research
06/28/2022

Left Heavy Tails and the Effectiveness of the Policy and Value Networks in DNN-based best-first search for Sokoban Planning

Despite the success of practical solvers in various NP-complete domains ...
research
10/07/2021

Generalization in Deep RL for TSP Problems via Equivariance and Local Search

Deep reinforcement learning (RL) has proved to be a competitive heuristi...
research
01/16/2014

Automatic Induction of Bellman-Error Features for Probabilistic Planning

Domain-specific features are important in representing problem structure...
research
05/05/2020

Generalized Planning With Deep Reinforcement Learning

A hallmark of intelligence is the ability to deduce general principles f...
research
04/27/2020

Learning Neural-Symbolic Descriptive Planning Models via Cube-Space Priors: The Voyage Home (to STRIPS)

We achieved a new milestone in the difficult task of enabling agents to ...
