Multi-view Human Body Mesh Translator

by Xiangjian Jiang, et al.

Existing methods for human mesh recovery mainly focus on single-view frameworks, but they often fail to produce accurate results because recovering a 3D mesh from a single image is ill-posed. Given the maturity of multi-view motion capture systems, we propose to address this ill-posed problem by leveraging multiple images from different views, thereby significantly improving the quality of the recovered meshes. In particular, we present the Multi-view human body Mesh Translator (MMT), a novel model that estimates human body meshes with the help of a vision transformer. MMT takes multi-view images as input and translates them to the target meshes in a single forward pass. MMT fuses features from different views in both the encoding and decoding phases, yielding representations embedded with global information. Additionally, to ensure that the tokens focus on human pose and shape, MMT performs cross-view alignment at the feature level by projecting 3D keypoint positions to each view and enforcing their consistency through geometric constraints. Comprehensive experiments demonstrate that MMT outperforms existing single-view and multi-view models by a large margin on the human mesh recovery task, notably achieving a 28.8% improvement in MPVE over the current state-of-the-art method on the challenging HUMBI dataset. Qualitative evaluation also verifies the effectiveness of MMT in reconstructing high-quality human meshes. Code will be made available upon acceptance.
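The cross-view alignment idea mentioned in the abstract (projecting 3D keypoints into each view and checking geometric consistency) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pinhole camera model, the function names `project_points` and `reprojection_consistency`, the mean L2 pixel error used as the consistency measure, and the toy camera parameters are all assumptions made for illustration.

```python
import numpy as np


def project_points(points_3d, K, R, t):
    """Project (N, 3) world points to (N, 2) pixels with an assumed pinhole camera."""
    cam = points_3d @ R.T + t            # world -> camera coordinates
    uvw = cam @ K.T                      # camera -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]      # perspective divide


def reprojection_consistency(points_3d, keypoints_2d, cameras):
    """Mean L2 pixel distance between projected 3D keypoints and per-view 2D keypoints.

    `cameras` is a list of hypothetical (K, R, t) tuples, one per view;
    `keypoints_2d` is a list of (N, 2) arrays in the same order.
    """
    errors = []
    for (K, R, t), kp2d in zip(cameras, keypoints_2d):
        proj = project_points(points_3d, K, R, t)
        errors.append(np.linalg.norm(proj - kp2d, axis=1).mean())
    return float(np.mean(errors))


# Toy setup: two views sharing intrinsics, the second rotated 90 degrees about the y-axis.
K = np.array([[1000.0, 0.0, 500.0],
              [0.0, 1000.0, 500.0],
              [0.0, 0.0, 1.0]])
R0, t0 = np.eye(3), np.array([0.0, 0.0, 5.0])
R1 = np.array([[0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0]])
t1 = np.array([0.0, 0.0, 5.0])
cameras = [(K, R0, t0), (K, R1, t1)]

joints = np.array([[0.0, 0.0, 0.0],
                   [0.1, -0.2, 0.3]])
keypoints_2d = [project_points(joints, *cam) for cam in cameras]

consistent_loss = reprojection_consistency(joints, keypoints_2d, cameras)
perturbed_loss = reprojection_consistency(joints + 0.05, keypoints_2d, cameras)
```

In this sketch, 3D keypoints that agree with the 2D observations in every view give a near-zero error, while a perturbed estimate gives a positive error. A term of this kind is one plausible way to express a multi-view geometric constraint. The abstract states that MMT applies its alignment at the feature level, which this pixel-space toy example does not reproduce.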




