SLGTformer: An Attention-Based Approach to Sign Language Recognition

12/21/2022
by Neil Song, et al.

Sign language is the preferred method of communication for deaf or mute people, but, like any language, it is difficult to learn, which creates a significant communication barrier between signers and those who are hard of hearing or unable to speak. A signer's entire frontal appearance conveys specific meaning. This frontal appearance can be quantified as a temporal sequence of human body poses, enabling Sign Language Recognition through the learning of spatiotemporal dynamics of skeleton keypoints. We propose a novel, attention-based approach to Sign Language Recognition built exclusively upon decoupled graph and temporal self-attention: the Sign Language Graph Time Transformer (SLGTformer). SLGTformer first deconstructs spatiotemporal pose sequences separately into spatial graphs and temporal windows. It then leverages novel Learnable Graph Relative Positional Encodings (LGRPE) to guide spatial self-attention with the graph neighborhood context of the human skeleton. By modeling the temporal dimension as intra- and inter-window dynamics, we introduce Temporal Twin Self-Attention (TTSA) as the combination of locally-grouped temporal attention (LTA) and global sub-sampled temporal attention (GSTA). We demonstrate the effectiveness of SLGTformer on the Word-Level American Sign Language (WLASL) dataset, achieving state-of-the-art performance with an ensemble-free approach on the keypoint modality. The code is available at https://github.com/neilsong/slt
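The intra-/inter-window decomposition behind TTSA can be illustrated with a minimal NumPy sketch. This is a single-head toy version under stated assumptions: no learned query/key/value projections, a fixed window size, and mean-pooling as the sub-sampling operator for GSTA (the paper's actual choices of projection and sub-sampling are not reproduced here).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention.
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def ttsa_sketch(x, window):
    """Toy Temporal Twin Self-Attention over a (T, D) frame sequence.

    LTA: self-attention restricted to each temporal window.
    GSTA: every frame attends to one sub-sampled summary per window.
    """
    T, D = x.shape
    assert T % window == 0, "T must be divisible by the window size"
    # LTA: intra-window dynamics.
    xw = x.reshape(T // window, window, D)
    local = attention(xw, xw, xw).reshape(T, D)
    # GSTA: inter-window dynamics via one mean summary per window.
    summaries = local.reshape(T // window, window, D).mean(axis=1)
    return attention(local, summaries, summaries)

pose_seq = np.random.randn(16, 8)   # 16 frames, 8-dim keypoint features
out = ttsa_sketch(pose_seq, window=4)
```

Restricting attention to windows keeps the local pass linear in sequence length, while the sub-sampled global pass propagates context across windows at low cost, mirroring the twin-attention idea described above.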


research
12/06/2021

Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production

Recent approaches to Sign Language Production (SLP) have adopted spoken ...
research
02/22/2023

Multi-View Bangla Sign Language (MV-BSL) Dataset and Continuous BSL Recognition

Being able to express our thoughts, feelings, and ideas to one another i...
research
10/12/2021

Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

Sign language is commonly used by deaf or mute people to communicate but...
research
04/10/2023

Isolated Sign Language Recognition based on Tree Structure Skeleton Images

Sign Language Recognition (SLR) systems aim to be embedded in video stre...
research
08/18/2023

Human Part-wise 3D Motion Context Learning for Sign Language Recognition

In this paper, we propose P3D, the human part-wise motion context learni...
research
02/03/2022

Exploring Sub-skeleton Trajectories for Interpretable Recognition of Sign Language

Recent advances in tracking sensors and pose estimation software enable ...
research
12/01/2020

Pose-based Sign Language Recognition using GCN and BERT

Sign language recognition (SLR) plays a crucial role in bridging the com...
