Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition

by Zhigang Tu, et al.

In recent years, graph convolutional networks (GCNs) have played an increasingly critical role in skeleton-based human action recognition. However, most GCN-based methods still have two main limitations: 1) They either consider only the motion information of the joints or process the joints and bones separately, and are therefore unable to fully explore the latent functional correlation between joints and bones for action recognition. 2) Most of these works are trained in a supervised manner, which relies heavily on massive labeled training data. To address these issues, we propose a semi-supervised skeleton-based action recognition method, a setting that has rarely been explored before. We design a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder to achieve semi-supervised learning. Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream, thereby prompting both streams to learn more discriminative feature representations. The pose-prediction-based auto-encoder in the self-supervised training stage allows the network to learn motion representations from unlabeled data, which is essential for action recognition. Extensive experiments on two popular datasets, i.e. NTU-RGB+D and Kinetics-Skeleton, demonstrate that our model achieves state-of-the-art performance for semi-supervised skeleton-based action recognition and is also useful for fully-supervised methods.
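To make the joint and bone streams and the pose prediction objective concrete, the following is a minimal sketch and not the authors' implementation. It assumes the common convention that a bone is the vector from a parent joint to its child joint, and that pose prediction means regressing future frames from observed ones under a mean squared error; the abstract states neither detail. The 5-joint chain `PARENTS` and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]
Frame = List[Joint]

# Hypothetical 5-joint chain skeleton: PARENTS[v] is the parent of joint v.
# Joint 0 is the root and is its own parent, so its bone vector is zero.
PARENTS = [0, 0, 1, 2, 3]


def joints_to_bones(frame: Frame, parents: Sequence[int] = PARENTS) -> Frame:
    """Derive the bone-stream input for one frame (assumed: child minus parent)."""
    return [
        tuple(c - p for c, p in zip(frame[v], frame[parents[v]]))
        for v in range(len(frame))
    ]


def split_for_pose_prediction(seq: List[Frame], n_future: int):
    """Split a sequence into observed frames (encoder input) and
    future frames (target for a pose prediction head)."""
    if not 0 < n_future < len(seq):
        raise ValueError("n_future must be between 1 and len(seq) - 1")
    return seq[:-n_future], seq[-n_future:]


def mean_squared_error(pred: List[Frame], target: List[Frame]) -> float:
    """Average squared coordinate error over all frames and joints."""
    total, count = 0.0, 0
    for pred_frame, target_frame in zip(pred, target):
        for pred_joint, target_joint in zip(pred_frame, target_frame):
            for a, b in zip(pred_joint, target_joint):
                total += (a - b) ** 2
                count += 1
    return total / count


# One static pose repeated over four frames.
pose: Frame = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 2.0, 0.0),
               (1.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
sequence = [list(pose) for _ in range(4)]

bones = joints_to_bones(pose)
observed, future = split_for_pose_prediction(sequence, n_future=1)

# Trivial stand-in for a decoder: repeat the last observed pose.
baseline_prediction = [observed[-1]]
baseline_loss = mean_squared_error(baseline_prediction, future)
```

Because this objective needs no action labels, a loss of this kind can be computed on unlabeled sequences, which is the role the abstract assigns to the self-supervised training stage.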



Pose-Guided Graph Convolutional Networks for Skeleton-Based Action Recognition

Graph convolutional networks (GCNs), which can model the human body skel...

JOLO-GCN: Mining Joint-Centered Light-Weight Information for Skeleton-Based Action Recognition

Skeleton-based action recognition has attracted research attentions in r...

Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition

We consider the problem of semi-supervised 3D action recognition which h...

PSUMNet: Unified Modality Part Streams are All You Need for Efficient Pose-based Action Recognition

Pose-based action recognition is predominantly tackled by approaches whi...

Learning from Temporal Gradient for Semi-supervised Action Recognition

Semi-supervised video action recognition tends to enable deep neural net...

Iterate Cluster: Iterative Semi-Supervised Action Recognition

We propose a novel system for active semi-supervised feature-based actio...

Train, Diagnose and Fix: Interpretable Approach for Fine-grained Action Recognition

Despite the growing discriminative capabilities of modern deep learning ...
