H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions

by Kei Ota, et al.
Mitsubishi Electric Corporation

The world is filled with articulated objects whose use is difficult to determine from vision alone; for example, a door might open inwards or outwards. Humans handle these objects with strategic trial and error: first pushing a door, then pulling if that doesn't work. We enable these capabilities in autonomous agents by proposing "Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work on manipulating objects after a handful of exploration actions on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
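
To make the hypothesize, simulate, act, update, repeat loop concrete, the following is a minimal toy sketch built on the door example from the abstract. It is not the authors' implementation: the two hypotheses, the `simulate`, `update`, and `explore` functions, the observation-noise value, and the stopping threshold are all assumptions introduced here for illustration.

```python
# Toy illustration only. Every name and constant here is hypothetical and
# is not taken from the H-SAUR paper or code.
HYPOTHESES = ("opens_inward", "opens_outward")
ACTIONS = ("push", "pull")


def simulate(hypothesis, action):
    """Toy simulator: predict whether the door moves under a hypothesis."""
    return (hypothesis, action) in {
        ("opens_inward", "push"),
        ("opens_outward", "pull"),
    }


def update(belief, action, observed_moved, noise=0.1):
    """Bayes update: reweight each hypothesis by how well it predicted the outcome."""
    posterior = {}
    for hypothesis, prior in belief.items():
        predicted = simulate(hypothesis, action)
        likelihood = (1.0 - noise) if predicted == observed_moved else noise
        posterior[hypothesis] = prior * likelihood
    total = sum(posterior.values())
    return {h: p / total for h, p in posterior.items()}


def explore(true_hypothesis, max_steps=5, threshold=0.95):
    """Run the loop until one hypothesis is believed with high probability."""
    belief = {h: 1.0 / len(HYPOTHESES) for h in HYPOTHESES}  # uniform prior
    history = []
    for _ in range(max_steps):
        # Hypothesize: take the currently most probable hypothesis.
        guess = max(belief, key=belief.get)
        # Simulate: choose the action that the guess predicts will move the door.
        action = next(a for a in ACTIONS if simulate(guess, a))
        # Act: the toy "real world" responds according to the true hypothesis.
        moved = simulate(true_hypothesis, action)
        history.append((action, moved))
        # Update: revise the belief given what was observed.
        belief = update(belief, action, moved)
        if max(belief.values()) >= threshold:
            break
    return belief, history


if __name__ == "__main__":
    final_belief, actions = explore("opens_outward")
    print(actions)
    print(final_belief)
```

In this sketch the agent first pushes, observes no motion, shifts its belief towards the door opening outwards, and then pulls. A learned prior, as mentioned in the abstract, would correspond to replacing the uniform initial belief with one predicted from vision.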



The Tools Challenge: Rapid Trial-and-Error Learning in Physical Problem Solving

Many animals, and an increasing number of artificial agents, display sop...

FOCUS: Object-Centric World Models for Robotics Manipulation

Understanding the world in terms of objects and the possible interplays ...

Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks

Rearranging and manipulating deformable objects such as cables, fabrics,...

UMPNet: Universal Manipulation Policy Network for Articulated Objects

We introduce the Universal Manipulation Policy Network (UMPNet) – a sing...

SORNet: Spatial Object-Centric Representations for Sequential Manipulation

Sequential manipulation tasks require a robot to perceive the state of a...

POMDP Manipulation Planning under Object Composition Uncertainty

Manipulating unknown objects in a cluttered environment is difficult bec...

Learning Hybrid Object Kinematics for Efficient Hierarchical Planning Under Uncertainty

Sudden changes in the dynamics of robotic tasks, such as contact with an...
