RGB Video Based Tennis Action Recognition Using a Deep Weighted Long Short-Term Memory

08/02/2018
by   Jiaxin Cai, et al.
0

Action recognition has attracted increasing attention from RGB input in computer vision partially due to potential applications on somatic simulation and statistics of sport such as virtual tennis game and tennis techniques and tactics analysis by video. Recently, deep learning based methods have achieved promising performance for action recognition. In this paper, we propose weighted Long Short-Term Memory adopted with convolutional neural network representations for three dimensional tennis shots recognition. First, the local two-dimensional convolutional neural network spatial representations are extracted from each video frame individually using a pre-trained Inception network. Then, a weighted Long Short-Term Memory decoder is introduced to take the output state at time t and the historical embedding feature at time t-1 to generate feature vector using a score weighting scheme. Finally, we use the adopted CNN and weighted LSTM to map the original visual features into a vector space to generate the spatial-temporal semantical description of visual sequences and classify the action video content. Experiments on the benchmark demonstrate that our method using only simple raw RGB video can achieve better performance than the state-of-the-art baselines for tennis shot recognition.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
07/06/2017

Skeleton-based Action Recognition Using LSTM and CNN

Recent methods based on 3D skeleton data have achieved outstanding perfo...
research
06/16/2020

Global Feature Aggregation for Accident Anticipation

Anticipation of accidents ahead of time in autonomous and non-autonomous...
research
12/10/2019

Listen to Look: Action Recognition by Previewing Audio

In the face of the video data deluge, today's expensive clip-level class...
research
06/05/2022

3D Convolutional with Attention for Action Recognition

Human action recognition is one of the challenging tasks in computer vis...
research
06/07/2019

Risky Action Recognition in Lane Change Video Clips using Deep Spatiotemporal Networks with Segmentation Mask Transfer

Advanced driver assistance and automated driving systems rely on risk es...
research
02/16/2015

Unsupervised Learning of Video Representations using LSTMs

We use multilayer Long Short Term Memory (LSTM) networks to learn repres...
research
11/01/2019

Multimodal Video-based Apparent Personality Recognition Using Long Short-Term Memory and Convolutional Neural Networks

Personality computing and affective computing, where the recognition of ...

Please sign up or login with your details

Forgot password? Click here to reset