Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement

07/10/2023
by Anthony Simeonov, et al.

We propose a system for rearranging objects in a scene to achieve a desired object-scene placing relationship, such as a book inserted in an open slot of a bookshelf. The pipeline generalizes to novel geometries, poses, and layouts of both scenes and objects, and is trained from demonstrations to operate directly on 3D point clouds. Our system overcomes the challenges that arise when a given scene admits many geometrically similar rearrangement solutions. By leveraging an iterative pose denoising training procedure, we can fit multi-modal demonstration data and produce multi-modal outputs while remaining precise and accurate. We also show the advantages of conditioning on relevant local geometric features while ignoring irrelevant global structure that harms both generalization and precision. We demonstrate our approach, in both simulation and the real world, on three distinct rearrangement tasks that require handling multi-modality and generalizing over object shape and pose. Project website, code, and videos: https://anthonysimeonov.github.io/rpdiff-multi-modal/
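
To make the idea of pose denoising on point clouds more concrete, the sketch below shows one generic way to build a training pair: take the object point cloud at its demonstrated final pose, perturb it with a random rigid transform, and use the transform that undoes the perturbation as the regression target. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the noise distribution, network, or loss, so the function names (`random_rotation`, `make_training_pair`) and the parameters `max_angle` and `max_trans` are hypothetical.

```python
import numpy as np


def random_rotation(rng, max_angle):
    """Rotation about a random axis by an angle in [0, max_angle] (Rodrigues' formula)."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(0.0, max_angle)
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def make_transform(R, t):
    """Pack a rotation and a translation into a 4x4 homogeneous transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T


def apply_transform(T, points):
    """Apply a 4x4 rigid transform to an (N, 3) array of points."""
    return points @ T[:3, :3].T + T[:3, 3]


def make_training_pair(final_points, rng, max_angle=np.pi / 4, max_trans=0.1):
    """Perturb the demonstrated final point cloud and return (noised cloud, target).

    The target is the transform that maps the noised cloud back onto the
    demonstrated one. The noise ranges are assumptions chosen for illustration.
    """
    T_noise = make_transform(
        random_rotation(rng, max_angle),
        rng.uniform(-max_trans, max_trans, size=3),
    )
    noised = apply_transform(T_noise, final_points)
    T_target = np.linalg.inv(T_noise)
    return noised, T_target
```

At inference time, a model trained on such pairs could be applied repeatedly, each step moving the object by the predicted correction. Because different demonstrations can place the object in different valid slots, different starting poses can settle into different solutions, which is one way to read the multi-modal behavior the abstract describes.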


