Learning High Fidelity Depths of Dressed Humans by Watching Social Media Dance Videos

03/04/2021
by Yasamin Jafarian, et al.

A key challenge in learning the geometry of dressed humans is the limited availability of ground truth data (e.g., 3D scanned models), which degrades the performance of 3D human reconstruction when applied to real-world imagery. We address this challenge by leveraging a new data resource: a number of social media dance videos that span diverse appearances, clothing styles, performances, and identities. Each video depicts the dynamic movements of the body and clothes of a single person but lacks 3D ground truth geometry. To utilize these videos, we present a new method that uses a local transformation to warp the predicted local geometry of the person in one image to that in another image at a different time instant. This enables self-supervision by enforcing temporal coherence over the predictions. In addition, we jointly learn depth along with surface normals, which are highly responsive to local texture, wrinkles, and shading, by maximizing their geometric consistency. Our method is end-to-end trainable, resulting in high fidelity depth estimation that predicts fine geometry faithful to the input real image. We demonstrate that our method outperforms state-of-the-art human depth estimation and human shape recovery approaches on both real and rendered images.
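
The depth and surface normal consistency mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes unit pixel spacing and an orthographic-style camera, derives normals from a depth map by finite differences, and scores agreement with a separately predicted normal map using a cosine penalty. The function names `normals_from_depth` and `normal_consistency_loss` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def normals_from_depth(depth):
    """Derive unit surface normals from a depth map via finite differences.

    Assumption (not from the paper): unit pixel spacing and an
    orthographic-style camera, so the normal is proportional to
    (-dz/dx, -dz/dy, 1).
    """
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    dz_dy, dz_dx = np.gradient(depth)
    n = np.stack([-dz_dx, -dz_dy, np.ones_like(depth)], axis=-1)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)

def normal_consistency_loss(depth, predicted_normals):
    """Mean (1 - cosine similarity) between depth-derived and predicted normals.

    The loss is 0 when the two normal fields agree everywhere.
    """
    derived = normals_from_depth(depth)
    cosine = np.sum(derived * predicted_normals, axis=-1)
    return float(np.mean(1.0 - cosine))

# A flat surface facing the camera, paired with normals pointing along +z.
flat_depth = np.full((4, 4), 2.0)
facing_camera = np.zeros((4, 4, 3))
facing_camera[..., 2] = 1.0
flat_loss = normal_consistency_loss(flat_depth, facing_camera)

# A plane tilted along x disagrees with the same camera-facing normals.
tilted_depth = np.tile(np.arange(4, dtype=float), (4, 1))
tilted_loss = normal_consistency_loss(tilted_depth, facing_camera)
```

In a training setting, a term of this kind would be minimized jointly over both predictions, so that the depth branch inherits fine detail (wrinkles, shading cues) captured by the normal branch. The exact formulation used in the paper may differ.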


research
03/29/2021

Adaptive Surface Normal Constraint for Depth Estimation

We present a novel method for single image depth estimation using surfac...
research
05/21/2019

SharpNet: Fast and Accurate Recovery of Occluding Contours in Monocular Depth Estimation

We introduce SharpNet, a method that predicts an accurate depth map for ...
research
02/03/2022

Boosting Monocular Depth Estimation with Sparse Guided Points

Existing monocular depth estimation shows excellent robustness in the wi...
research
02/26/2019

Region Deformer Networks for Unsupervised Depth Estimation from Unconstrained Monocular Videos

While learning based depth estimation from images/videos has achieved su...
research
12/19/2020

Self-supervised monocular depth estimation from oblique UAV videos

UAVs have become an essential photogrammetric measurement as they are af...
research
08/17/2021

ARCH++: Animation-Ready Clothed Human Reconstruction Revisited

We present ARCH++, an image-based method to reconstruct 3D avatars with ...
research
06/28/2020

Interpretable Deepfake Detection via Dynamic Prototypes

Deepfake is one notorious application of deep learning research, leading...
