Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats

by István Sárándi, et al.

Deep learning-based 3D human pose estimation performs best when trained on large amounts of labeled data, making combined learning from many datasets an important research direction. One obstacle to this endeavor is that different datasets provide different skeleton formats, i.e., they do not label the same set of anatomical landmarks. There is little prior research on how best to supervise one model with such discrepant labels. We show that simply using separate output heads for different skeletons results in inconsistent depth estimates and insufficient information sharing across skeletons. As a remedy, we propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks. The discovered latent 3D points capture the redundancy among skeletons, enabling enhanced information sharing when used for consistency regularization. Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model, which outperforms prior work on a range of benchmarks, including the challenging 3D Poses in the Wild (3DPW) dataset. Our code and models are available for research purposes.
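
The idea of reducing many landmarks to a few latent 3D points by affine combination can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the landmark and latent counts, and the random positive weights are all assumptions made for illustration, and the training objective is not shown. The sketch demonstrates only one geometric property: when each row of the weight matrices sums to one, encoding and decoding commute with translation of the pose.

```python
import numpy as np

def affine_weights(raw):
    """Normalize each row to sum to 1, so that every output point is an
    affine combination of the input points (hypothetical helper)."""
    return raw / raw.sum(axis=1, keepdims=True)

def encode(joints, w_enc):
    """Map J landmarks (J, 3) to L latent 3D points (L, 3)."""
    return w_enc @ joints

def decode(latents, w_dec):
    """Map L latent 3D points (L, 3) back to J landmarks (J, 3)."""
    return w_dec @ latents

rng = np.random.default_rng(0)
num_joints, num_latents = 40, 8  # illustrative sizes, not from the paper

# Random positive weights stand in for learned ones (assumption).
w_enc = affine_weights(rng.uniform(0.1, 1.0, size=(num_latents, num_joints)))
w_dec = affine_weights(rng.uniform(0.1, 1.0, size=(num_joints, num_latents)))

pose = rng.normal(size=(num_joints, 3))
shift = np.array([0.5, -1.0, 2.0])

recon = decode(encode(pose, w_enc), w_dec)
recon_shifted = decode(encode(pose + shift, w_enc), w_dec)

# Rows summing to one make the mapping translation-equivariant.
print(np.allclose(recon_shifted, recon + shift))
```

Because the weights act on whole 3D points rather than on individual coordinates, the same mapping also commutes with rotation of the pose, which is one sense in which such an autoencoder can be called geometry-aware.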



Weakly-Supervised Discovery of Geometry-Aware Representation for 3D Human Pose Estimation

Recent studies have shown remarkable advances in 3D human pose estimatio...

DeepSkeleton: Skeleton Map for 3D Human Pose Regression

Despite recent success on 2D human pose estimation, 3D human pose estima...

Anatomy-aware 3D Human Pose Estimation in Videos

In this work, we propose a new solution for 3D human pose estimation in ...

Unsupervised Geometry-Aware Representation for 3D Human Pose Estimation

Modern 3D human pose estimation techniques rely on deep networks, which ...

Exemplar Fine-Tuning for 3D Human Pose Fitting Towards In-the-Wild 3D Human Pose Estimation

We propose a method for building large collections of human poses with f...

Structure-Aware and Temporally Coherent 3D Human Pose Estimation

Deep learning methods for 3D human pose estimation from RGB images requi...

Unite the People: Closing the Loop Between 3D and 2D Human Representations

3D models provide a common ground for different representations of human...
