Synthesizing Diverse Human Motions in 3D Indoor Scenes

by Kaifeng Zhao, et al.

We present a novel method for populating 3D indoor scenes with virtual humans that can navigate the environment and interact with objects in a realistic manner. Existing approaches rely on high-quality training sequences that capture a diverse range of human motions in 3D scenes. However, such motion data is costly and difficult to obtain, and it can never cover the full range of plausible human-scene interactions in complex indoor environments. To address these challenges, we propose a reinforcement learning-based approach that learns policy networks predicting the latent variables of a powerful generative motion model trained on the large-scale AMASS motion capture dataset. For navigation in 3D environments, we propose a scene-aware policy training scheme with a novel collision avoidance reward function. Combined with the generative motion model, our approach synthesizes highly diverse human motions that navigate 3D indoor scenes while effectively avoiding obstacles. For detailed human-object interactions, we carefully curate interaction-aware reward functions by leveraging a marker-based body representation and the signed distance field (SDF) representation of the 3D scene. With several key training design choices, our method synthesizes realistic and diverse human-object interactions (e.g., sitting on a chair and then getting up), even for out-of-distribution test scenarios with different object shapes, orientations, starting body positions, and poses. Experimental results demonstrate that our approach outperforms state-of-the-art human-scene interaction synthesis frameworks in terms of both motion naturalness and diversity. Video results are available on the project page.
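The abstract does not give the exact form of the collision avoidance reward; as a rough illustration of how an SDF-based collision penalty over body markers might be structured, here is a minimal sketch. All names, the margin parameter, and the functional form are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def collision_penalty(marker_positions, scene_sdf, eps=0.02):
    """Hypothetical SDF-based collision-avoidance reward term.

    marker_positions: (M, 3) array of body surface marker positions.
    scene_sdf: callable mapping (M, 3) points to (M,) signed distances
               (negative inside scene geometry).
    Returns a non-positive penalty: 0 when every marker stays at least
    `eps` outside obstacles, increasingly negative with penetration.
    """
    d = scene_sdf(marker_positions)            # signed distance per marker
    penetration = np.clip(eps - d, 0.0, None)  # distance past the safety margin
    return -float(penetration.sum())

# Toy scene: a floor plane at z = 0, whose SDF is simply the z-coordinate.
sdf = lambda p: p[:, 2]
markers = np.array([[0.0, 0.0, 0.5],    # clear of the floor
                    [0.0, 0.0, -0.1]])  # penetrating the floor
r = collision_penalty(markers, sdf)     # negative: one marker penetrates
```

In a policy-training loop, such a term would be summed with task rewards (e.g., goal reaching or contact quality) at every step, so the policy is steered away from poses whose markers enter the scene's SDF.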




