Robust Human Motion Forecasting using Transformer-based Model

by Esteve Valls Mascaro, et al.

Comprehending human motion is a fundamental challenge for developing Human-Robot Collaborative applications. Computer vision researchers have addressed this field by focusing only on reducing prediction error, without taking into account the requirements for deployment on robots. In this paper, we propose a new Transformer-based model that handles real-time 3D human motion forecasting in both the short and long term. Our 2-Channel Transformer (2CH-TR) efficiently exploits the spatio-temporal information of a briefly observed sequence (400 ms) and achieves accuracy competitive with the current state of the art. 2CH-TR stands out for its efficiency, being lighter and faster than its competitors. In addition, we test our model in conditions where the human motion is severely occluded, demonstrating its robustness in reconstructing and predicting 3D human motion in a highly noisy environment. Our experimental results show that the proposed 2CH-TR outperforms ST-Transformer, another state-of-the-art Transformer-based model, in both reconstruction and prediction under the same input-prefix conditions. Compared with ST-Transformer, our model achieves a reduction of 8.89 in short-term prediction, and of 2.57, on the Human3.6M dataset with a 400 ms input prefix.




Stecformer: Spatio-temporal Encoding Cascaded Transformer for Multivariate Long-term Time Series Forecasting

Multivariate long-term time series forecasting is of great application a...

TENET: Transformer Encoding Network for Effective Temporal Flow on Motion Prediction

This technical report presents an effective method for motion prediction...

Solar Irradiance Anticipative Transformer

This paper proposes an anticipative transformer-based model for short-te...

Convolutional Sequence to Sequence Model for Human Dynamics

Human motion modeling is a classic problem in computer vision and graphi...

Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers

We propose to leverage Transformer architectures for non-autoregressive ...

TransFusion: A Practical and Effective Transformer-based Diffusion Model for 3D Human Motion Prediction

Predicting human motion plays a crucial role in ensuring a safe and effe...

Entry-Flipped Transformer for Inference and Prediction of Participant Behavior

Some group activities, such as team sports and choreographed dances, inv...
