Towards Automatic Speech to Sign Language Generation

06/24/2021
by   Parul Kapoor, et al.

We aim to solve, for the first time, the highly challenging task of generating continuous sign language videos solely from speech segments. Recent efforts in this space have focused on generating such videos from human-annotated text transcripts without considering other modalities. However, replacing speech with sign language is a practical way to communicate with people who have hearing loss. Therefore, we eliminate the need for text as input and design techniques that work for more natural, continuous, freely uttered speech covering an extensive vocabulary. Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos. Next, we propose a multi-tasking transformer network trained to generate the signer's poses from speech segments. With speech-to-text as an auxiliary task and an additional cross-modal discriminator, our model learns to generate continuous sign pose sequences in an end-to-end manner. Extensive experiments and comparisons with baselines demonstrate the effectiveness of our approach. We also conduct ablation studies to analyze the effect of the different modules of our network. A demo video containing several results is included in the supplementary material.

research
12/20/2020

Can Everybody Sign Now? Exploring Sign Language Video Generation from 2D Poses

Recent work has addressed the generation of human poses represented by ...
research
03/30/2021

Read and Attend: Temporal Localisation in Sign Language Videos

The objective of this work is to annotate sign instances across a broad ...
research
05/06/2021

Aligning Subtitles in Sign Language Videos

The goal of this work is to temporally align asynchronous subtitles in s...
research
08/18/2020

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Sign Language is the primary means of communication for the majority of ...
research
05/18/2023

Cross-modality Data Augmentation for End-to-End Sign Language Translation

End-to-end sign language translation (SLT) aims to convert sign language...
research
11/01/2021

Sign-to-Speech Model for Sign Language Understanding: A Case Study of Nigerian Sign Language

Through this paper, we seek to reduce the communication barrier between ...
research
10/11/2020

Boosting Continuous Sign Language Recognition via Cross Modality Augmentation

Continuous sign language recognition (SLR) deals with unaligned video-te...
