HuManiFlow: Ancestor-Conditioned Normalising Flows on SO(3) Manifolds for Human Pose and Shape Distribution Estimation

by Akash Sengupta, et al.

Monocular 3D human pose and shape estimation is an ill-posed problem since multiple 3D solutions can explain a 2D image of a subject. Recent approaches predict a probability distribution over plausible 3D pose and shape parameters conditioned on the image. We show that these approaches exhibit a trade-off between three key properties: (i) accuracy - the likelihood of the ground-truth 3D solution under the predicted distribution, (ii) sample-input consistency - the extent to which 3D samples from the predicted distribution match the visible 2D image evidence, and (iii) sample diversity - the range of plausible 3D solutions modelled by the predicted distribution. Our method, HuManiFlow, predicts simultaneously accurate, consistent and diverse distributions. We use the human kinematic tree to factorise full body pose into ancestor-conditioned per-body-part pose distributions in an autoregressive manner. Per-body-part distributions are implemented using normalising flows that respect the manifold structure of SO(3), the Lie group of per-body-part poses. We show that ill-posed, but ubiquitous, 3D point estimate losses reduce sample diversity, and employ only probabilistic training losses. Code is available at:
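
The ancestor-conditioned factorisation described above can be illustrated with a small sketch. It is not the authors' implementation: the five-part tree `PARENTS`, the helper names, and the noise scale are hypothetical, and a simple tangent-space Gaussian perturbation stands in for the learned normalising flow. The sketch only shows the sampling order (ancestors before descendants) and that each per-part sample remains a valid rotation in SO(3).

```python
import math
import random

# Hypothetical toy kinematic tree (not the body model used in the paper):
# PARENTS[i] is the parent of part i, and -1 marks the root.
PARENTS = [-1, 0, 1, 1, 3]


def matmul(A, B):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def ancestors(part, parents=PARENTS):
    """Return the ancestors of `part`, ordered from the root down to its parent."""
    chain = []
    p = parents[part]
    while p != -1:
        chain.append(p)
        p = parents[p]
    return chain[::-1]


def exp_map(v):
    """Rodrigues' formula: axis-angle vector in R^3 -> rotation matrix in SO(3)."""
    theta = math.sqrt(sum(x * x for x in v))
    if theta < 1e-8:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (comp / theta for comp in v)
    K = [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]  # skew matrix of the unit axis
    K2 = matmul(K, K)
    a, b = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]


def sample_part(ancestor_rotations, rng, scale=0.3):
    """Placeholder conditional sampler, standing in for a learned flow.

    Draws a Gaussian perturbation in the tangent space and applies it to the
    parent's rotation (or to the identity for the root), so the sample
    depends on its ancestors and stays in SO(3).
    """
    if ancestor_rotations:
        base = ancestor_rotations[-1]
    else:
        base = exp_map([0.0, 0.0, 0.0])
    noise = [rng.gauss(0.0, scale) for _ in range(3)]
    return matmul(base, exp_map(noise))


def sample_pose(rng, parents=PARENTS):
    """Sample every part in index order; assumes parents precede children."""
    pose = []
    for part in range(len(parents)):
        context = [pose[a] for a in ancestors(part, parents)]
        pose.append(sample_part(context, rng))
    return pose


pose = sample_pose(random.Random(0))
print(len(pose), ancestors(4))
```

Under these assumptions the joint distribution over the pose is the product of the per-part conditionals, so its log-likelihood would be the sum of per-part log-densities, each evaluated given the rotations of that part's ancestors.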



Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild

This paper addresses the problem of 3D human body shape and pose estimat...

Probabilistic Estimation of 3D Human Shape and Pose with a Semantic Local Parametric Model

This paper addresses the problem of 3D human body shape and pose estimat...

Error Estimation for Single-Image Human Body Mesh Reconstruction

Human pose and shape estimation methods continue to suffer in situations...

On the Instability of Relative Pose Estimation and RANSAC's Role

In this paper we study the numerical instabilities of the 5- and 7-point...

Probabilistic Human Mesh Recovery in 3D Scenes from Egocentric Views

Automatic perception of human behaviors during social interactions is cr...

Monocular 3D Human Pose Estimation by Generation and Ordinal Ranking

Monocular 3D Human Pose Estimation from static images is a challenging p...

Learning Human Poses from Actions

We consider the task of learning to estimate human pose in still images....
