ZS-SLR: Zero-Shot Sign Language Recognition from RGB-D Videos

08/23/2021
by Razieh Rastgoo, et al.

Sign Language Recognition (SLR) is a challenging research area in computer vision. To tackle the annotation bottleneck in SLR, we formulate the problem of Zero-Shot Sign Language Recognition (ZS-SLR) and propose a two-stream model that takes two input modalities: RGB and Depth videos. To benefit from the capabilities of vision Transformers, we use two vision Transformer models, one for human detection and one for visual feature representation. We configure a Transformer encoder-decoder architecture as a fast and accurate human detection model to overcome the challenges of current human detection models. Using the human keypoints, the detected human body is segmented into nine parts. A spatio-temporal representation of the human body is obtained using a vision Transformer and an LSTM network. A semantic space maps the visual features to the language embeddings of the class labels, obtained via a Bidirectional Encoder Representations from Transformers (BERT) model. We evaluate the proposed model on four datasets (Montalbano II, MSR Daily Activity 3D, CAD-60, and NTU-60), achieving state-of-the-art results relative to existing ZS-SLR models.
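
The zero-shot step described in the abstract, mapping visual features into a semantic space and comparing them with label embeddings, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the concatenation-based fusion of the RGB and Depth streams, the linear projection `W`, and the two-dimensional toy vectors are all assumptions made for illustration; in the paper the visual features come from a vision Transformer with an LSTM and the label embeddings come from BERT.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def fuse(rgb_feat, depth_feat):
    """Fuse the two streams by concatenation (an assumed fusion rule)."""
    return list(rgb_feat) + list(depth_feat)


def project(W, x):
    """Apply a linear map W (given as a list of rows) to the vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def predict_unseen(rgb_feat, depth_feat, W, class_embeddings):
    """Return the unseen class whose label embedding is closest to the
    projected visual feature, measured by cosine similarity."""
    z = project(W, fuse(rgb_feat, depth_feat))
    return max(class_embeddings, key=lambda label: cosine(z, class_embeddings[label]))


# Hypothetical toy data: stand-ins for BERT label embeddings of unseen classes.
class_embeddings = {"hello": [1.0, 0.0], "thanks": [0.0, 1.0]}

# Hypothetical projection from the 4-d fused feature to the 2-d semantic space.
W = [[1.0, 0.0, 1.0, 0.0],
     [0.0, 1.0, 0.0, 1.0]]

label = predict_unseen([0.9, 0.1], [0.8, 0.2], W, class_embeddings)
print(label)
```

In a real system the projection would be learned on the seen classes only, so that at test time a class with no training videos can still be recognized from its label embedding alone.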

research
09/02/2021

Multi-Modal Zero-Shot Sign Language Recognition

Zero-Shot Learning (ZSL) has rapidly advanced in recent years. Towards o...
research
07/24/2019

Zero-Shot Sign Language Recognition: Can Textual Data Uncover Sign Languages?

We introduce the problem of zero-shot sign language recognition (ZSSLR),...
research
12/16/2021

TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning

Zero-shot learning (ZSL) tackles the novel class recognition problem by ...
research
11/02/2022

Two-Stream Network for Sign Language Recognition and Translation

Sign languages are visual languages using manual articulations and non-m...
research
03/21/2023

Natural Language-Assisted Sign Language Recognition

Sign languages are visual languages which convey information by signers'...
research
06/18/2022

VReBERT: A Simple and Flexible Transformer for Visual Relationship Detection

Visual Relationship Detection (VRD) impels a computer vision model to 's...
research
04/05/2022

A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition

Sign language recognition from sequences of monocular images or 2D poses...
