Unsupervised Feature Learning of Human Actions as Trajectories in Pose Embedding Manifold

by Jogendra Nath Kundu, et al.

An unsupervised human action modeling framework can provide useful pose-sequence representations, which can be utilized in a variety of pose analysis applications. In this work we propose a novel temporal pose-sequence modeling framework that efficiently embeds the dynamics of 3D human-skeleton joints in a continuous latent space. In contrast to the end-to-end frameworks explored in previous works, we disentangle the task of learning individual pose representations from the task of learning actions as trajectories in the pose embedding space. To realize a continuous pose embedding manifold with improved reconstructions, we propose an unsupervised manifold learning procedure named Encoder GAN (EnGAN). Further, we use the pose embeddings generated by EnGAN to model human actions with a bidirectional RNN auto-encoder architecture, PoseRNN. We introduce a first-order gradient loss to explicitly enforce temporal regularity in the predicted motion sequence. A hierarchical feature fusion technique is also investigated for simultaneous modeling of local skeleton joints along with global pose variations. We demonstrate state-of-the-art transferability of the learned representation, compared with other motion embeddings learned with and without supervision, on the task of fine-grained action recognition on the SBU interaction dataset. Further, we show the qualitative strengths of the proposed framework by visualizing skeleton pose reconstructions and interpolations in the pose-embedding space, as well as low-dimensional principal component projections of the reconstructed pose trajectories.
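
The abstract does not spell out the form of the first-order gradient loss. As a rough illustration only, one common reading is a penalty on the mismatch between frame-to-frame differences of the predicted and reference sequences. The sketch below assumes that reading; the function names, the flat per-frame coordinate layout, and the mean-squared form are all hypothetical choices, not the authors' definition.

```python
def temporal_differences(seq):
    """Frame-to-frame (first-order) differences of a pose sequence.

    seq: list of T frames, each a flat list of D joint coordinates
    (hypothetical layout, chosen for this sketch).
    Returns a list of T-1 difference frames.
    """
    return [
        [cur_v - prev_v for prev_v, cur_v in zip(prev, cur)]
        for prev, cur in zip(seq, seq[1:])
    ]


def first_order_gradient_loss(pred, target):
    """Mean squared error between the temporal differences of two sequences.

    Assumed form, not taken from the paper: it compares how each coordinate
    changes between consecutive frames, rather than the coordinates themselves.
    """
    if len(pred) != len(target) or len(pred) < 2:
        raise ValueError("sequences need equal length and at least two frames")

    d_pred = temporal_differences(pred)
    d_target = temporal_differences(target)

    total, count = 0.0, 0
    for frame_p, frame_t in zip(d_pred, d_target):
        if len(frame_p) != len(frame_t):
            raise ValueError("frames must have the same number of coordinates")
        for p, t in zip(frame_p, frame_t):
            total += (p - t) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one coordinate")
    return total / count


# Reference motion: every coordinate increases by 1 per frame.
target = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
# Same motion with a constant offset: identical frame-to-frame changes.
shifted = [[5.0, 5.0], [6.0, 6.0], [7.0, 7.0]]
# Jerky motion: jumps by 2, then stalls.
jerky = [[0.0, 0.0], [2.0, 2.0], [2.0, 2.0]]

loss_shifted = first_order_gradient_loss(shifted, target)
loss_jerky = first_order_gradient_loss(jerky, target)
```

In this sketch a constant offset leaves the loss at zero while irregular motion is penalized, so a term of this kind only constrains temporal behavior and would need to be paired with a reconstruction term to pin down the poses themselves.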



Unsupervised Human 3D Pose Representation with Viewpoint and Pose Disentanglement

Learning a good 3D human pose representation is important for human pose...

Self-Supervised 3D Action Representation Learning with Skeleton Cloud Colorization

3D Skeleton-based human action recognition has attracted increasing atte...

3D Skeleton-based Human Motion Prediction with Manifold-Aware GAN

In this work we propose a novel solution for 3D skeleton-based human mot...

Human Mesh Recovery from Monocular Images via a Skeleton-disentangled Representation

We describe an end-to-end method for recovering 3D human body mesh from ...

FenceNet: Fine-grained Footwork Recognition in Fencing

Current data analysis for the Canadian Olympic fencing team is primarily...

Feature Space Transfer for Data Augmentation

The problem of data augmentation in feature space is considered. A new a...

BiHMP-GAN: Bidirectional 3D Human Motion Prediction GAN

Human motion prediction model has applications in various fields of comp...
