PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds

by Xiaoxue Chen, et al.

3D scene understanding from point clouds plays a vital role in various robotic applications. Unfortunately, current state-of-the-art methods use separate neural networks for different tasks such as object detection and room layout estimation. Such a scheme has two limitations: 1) Storing and running several networks for different tasks is expensive for typical robotic platforms. 2) The intrinsic structure of the separate outputs is ignored and potentially violated. To this end, we propose the first transformer architecture that predicts 3D objects and layouts simultaneously from point cloud inputs. Unlike existing methods that estimate either layout keypoints or edges, we directly parameterize the room layout as a set of quads. As such, the proposed architecture is termed P(oint)Q(uad)-Transformer. Along with the novel quad representation, we propose a tailored physical constraint loss function that discourages object-layout interference. Quantitative and qualitative evaluations on the public benchmark ScanNet show that the proposed PQ-Transformer succeeds in jointly parsing 3D objects and layouts, running at a quasi-real-time rate (8.91 FPS) without efficiency-oriented optimization. Moreover, the new physical constraint loss can improve strong baselines, and the F1-score of the room layout is significantly improved from 37.9
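The abstract does not spell out how the quad representation or the physical constraint loss is defined, so the following is only a minimal illustrative sketch of the general idea, not the authors' formulation. It assumes a wall quad can be summarized by a center point and a normal pointing into the room, and an object by an axis-aligned box. The hypothetical helpers `box_corners` and `interference_penalty` are invented here for illustration; the penalty is a simple hinge on how far box corners cross to the wrong side of the wall plane, and it ignores the quad's finite extent.

```python
import numpy as np


def box_corners(center, size):
    """Return the 8 corners of an axis-aligned box as an (8, 3) array."""
    center = np.asarray(center, dtype=float)
    half = np.asarray(size, dtype=float) / 2.0
    # All sign combinations (+/-1, +/-1, +/-1) select the 8 corners.
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )
    return center + signs * half


def interference_penalty(corners, quad_center, quad_normal):
    """Hinge penalty for box corners lying behind a wall plane.

    Assumption (not from the paper): the wall quad is treated as an
    infinite plane through `quad_center` whose normal points into the room.
    """
    normal = np.asarray(quad_normal, dtype=float)
    normal = normal / np.linalg.norm(normal)
    # Signed distance of each corner to the plane; negative means the
    # corner has crossed to the outside of the wall.
    signed = (corners - np.asarray(quad_center, dtype=float)) @ normal
    return float(np.maximum(-signed, 0.0).sum())


if __name__ == "__main__":
    wall_center, wall_normal = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]
    inside = box_corners([1.0, 0.0, 1.0], [1.0, 1.0, 1.0])
    crossing = box_corners([0.2, 0.0, 1.0], [1.0, 1.0, 1.0])
    print(interference_penalty(inside, wall_center, wall_normal))
    print(interference_penalty(crossing, wall_center, wall_normal))
```

A box fully inside the room incurs no penalty, while a box straddling the wall is penalized in proportion to how far its corners penetrate. A differentiable version of such a term could be added to a detection loss, though whether the paper's loss takes this form is not stated in the abstract.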


LGT-Net: Indoor Panoramic Room Layout Estimation with Geometry-Aware Transformer Network

3D room layout estimation by a single panorama using deep neural network...

Floorplan Priors for Joint Camera Pose and Room Layout Estimation

We present a novel approach to reconstruct large or featureless scenes. ...

Iterative Transformer Network for 3D Point Cloud

3D point cloud is an efficient and flexible representation of 3D structu...

U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation

While the existing deep learning-based room layout estimation techniques...

Bridged Transformer for Vision and Point Cloud 3D Object Detection

3D object detection is a crucial research topic in computer vision, whic...

Physics Inspired Optimization on Semantic Transfer Features: An Alternative Method for Room Layout Estimation

In this paper, we propose an alternative method to estimate room layouts...

MCTS with Refinement for Proposals Selection Games in Scene Understanding

We propose a novel method applicable in many scene understanding problem...
