CNN-based Action Recognition and Supervised Domain Adaptation on 3D Body Skeletons via Kernel Feature Maps

by Yusuf Tas, et al.

Deep learning is ubiquitous across many areas of computer vision. It often requires large-scale datasets for training before being fine-tuned on small-to-medium scale problems. Activity recognition, also called action recognition, is one of many application areas of deep learning. While many Convolutional Neural Network (CNN) architectures work with RGB and optical flow frames, training on time sequences of 3D body skeleton joints is often performed via recurrent networks such as LSTM. In this paper, we propose a new representation which encodes sequences of 3D body skeleton joints as texture-like representations derived from mathematically rigorous kernel methods. Such a representation becomes the first layer of a standard CNN, e.g., ResNet-50, which is then used in a supervised domain adaptation pipeline to transfer information from the source to the target dataset. This lets us leverage the available Kinect-based data beyond training on a single dataset, and outperform simple fine-tuning on any two datasets combined in a naive manner. More specifically, we utilize the classes that overlap between datasets: we associate datapoints of the same class via the so-called commonality, known from supervised domain adaptation. We demonstrate state-of-the-art results on three publicly available benchmarks.
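To make the idea of a texture-like encoding more concrete, the sketch below shows one possible way to turn a skeleton sequence into an image-shaped array using Gaussian (RBF) responses to a fixed set of pivots. It is an illustration only, not the construction used in the paper: the function names `rbf_feature_map` and `skeleton_to_texture`, the pivot grid on [-1, 1], the values of `num_pivots` and `sigma`, and the layout (frames as rows, joint-pivot pairs as columns, x/y/z as three channels) are all assumptions made for this example.

```python
import math


def rbf_feature_map(value, pivots, sigma):
    """Gaussian response of a scalar to each pivot (hypothetical helper)."""
    return [math.exp(-((value - p) ** 2) / (2.0 * sigma ** 2)) for p in pivots]


def skeleton_to_texture(sequence, num_pivots=5, sigma=0.5):
    """Encode a skeleton sequence as a texture-like array (assumed layout).

    sequence: list of T frames; each frame is a list of J joints;
              each joint is an (x, y, z) tuple, assumed normalized to [-1, 1].
    Returns:  T rows, each with J * num_pivots pixels; every pixel is a
              3-tuple holding the x, y and z responses, like an RGB image.
    """
    # Evenly spaced pivots on [-1, 1]; this grid is an assumption.
    step = 2.0 / (num_pivots - 1)
    pivots = [-1.0 + i * step for i in range(num_pivots)]

    texture = []
    for frame in sequence:
        row = []
        for joint in frame:
            # One feature map per coordinate: 3 lists of num_pivots values.
            maps = [rbf_feature_map(c, pivots, sigma) for c in joint]
            # Regroup into num_pivots pixels with 3 channels each.
            row.extend(zip(*maps))
        texture.append(row)
    return texture


# Toy usage: 3 frames, 2 joints, all coordinates at the origin.
toy_sequence = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)] for _ in range(3)]
toy_texture = skeleton_to_texture(toy_sequence)
```

An array of this shape could be resized and passed to an image CNN in place of an RGB frame; how the resulting features are then aligned across source and target datasets is a separate step not covered by this sketch.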




