Compositional Foundation Models for Hierarchical Planning

09/15/2023
by   Anurag Ajay, et al.
To make effective decisions in novel environments with long-horizon goals, it is crucial to engage in hierarchical reasoning across spatial and temporal scales. This entails planning abstract subgoal sequences, visually reasoning about the underlying plans, and executing actions in accordance with the devised plan through visual-motor control. We propose Compositional Foundation Models for Hierarchical Planning (HiP), a foundation model that composes multiple expert foundation models, each trained individually on language, vision, or action data, to jointly solve long-horizon tasks. We use a large language model to construct symbolic plans that are grounded in the environment through a large video diffusion model. Generated video plans are then grounded to visual-motor control through an inverse dynamics model that infers actions from generated videos. To enable effective reasoning within this hierarchy, we enforce consistency between the models via iterative refinement. We illustrate the efficacy and adaptability of our approach in three different long-horizon table-top manipulation tasks.
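
The three-level hierarchy described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `hierarchical_plan` and the three `toy_*` stand-ins are hypothetical names introduced here, the "models" are trivial functions over a 1-D position rather than a language model, a video diffusion model, and a learned inverse dynamics model, and the iterative refinement step that enforces consistency between levels is omitted.

```python
from typing import Callable, List

# Hypothetical toy types: an "observation" is a 1-D position,
# a "video" is a list of future positions, an "action" is a displacement.
Observation = float
Action = float


def hierarchical_plan(
    goal: str,
    observation: Observation,
    propose_subgoals: Callable[[str, Observation], List[str]],
    generate_video: Callable[[str, Observation], List[Observation]],
    infer_action: Callable[[Observation, Observation], Action],
) -> List[Action]:
    """Compose three planners: subgoals -> imagined frames -> actions."""
    actions: List[Action] = []
    for subgoal in propose_subgoals(goal, observation):
        # Ground the symbolic subgoal in an imagined observation trajectory.
        frames = generate_video(subgoal, observation)
        trajectory = [observation] + frames
        # Recover one action per pair of consecutive frames.
        for prev, nxt in zip(trajectory, trajectory[1:]):
            actions.append(infer_action(prev, nxt))
        # Assume the subgoal is reached; plan the next one from there.
        observation = trajectory[-1]
    return actions


# --- Toy stand-ins (assumptions, not real foundation models) ---

def toy_task_planner(goal: str, obs: Observation) -> List[str]:
    # Stand-in for the language model: split the goal into subgoals.
    return goal.split(" then ")


def toy_video_model(subgoal: str, obs: Observation) -> List[Observation]:
    # Stand-in for the video model: two frames, the midpoint and the target.
    target = float(subgoal.split()[-1])
    return [(obs + target) / 2, target]


def toy_inverse_dynamics(prev: Observation, nxt: Observation) -> Action:
    # Stand-in for the inverse dynamics model: displacement between frames.
    return nxt - prev


plan = hierarchical_plan(
    "move to 2 then move to 6",
    0.0,
    toy_task_planner,
    toy_video_model,
    toy_inverse_dynamics,
)
print(plan)
```

Passing the three levels in as plain callables mirrors the compositional idea: each expert can be swapped out independently as long as its inputs and outputs match the level above and below it.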
