
Clean Text and Full-Body Transformer: Microsoft's Submission to the WMT22 Shared Task on Sign Language Translation

by Subhadeep Dey, et al.

This paper describes Microsoft's submission to the first shared task on sign language translation at WMT 2022, a public competition on translating sign language into spoken language for Swiss German Sign Language. The task is very challenging because of data scarcity and an unprecedented target-side vocabulary of more than 20k words. Moreover, the data is taken from real broadcast news, includes native signing, and contains long videos. Motivated by recent advances in action recognition, we incorporate full-body information by extracting features from a pre-trained I3D model and feeding them to a standard transformer network. The accuracy of the system is further improved by careful cleaning of the target text. We obtain BLEU scores of 0.6 and 0.78 on the test and dev sets, respectively, which are the best scores among the participants of the shared task. The submission also ranks first in the human evaluation. Adding features extracted from a lip reading model further improves the BLEU score on the dev set to 1.08.
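
The abstract credits part of the gain to cleaning the target text but does not list the rules used. As a rough illustration only, the sketch below shows what a target-side cleaning pass could look like in Python. The function names `clean_target_text` and `clean_corpus` and every rule in them (Unicode normalization, dropping bracketed annotations, collapsing whitespace) are assumptions made for this example, not the authors' actual pipeline.

```python
import re
import unicodedata

# Hypothetical cleaning rules; the abstract does not state the real ones.
_BRACKETED = re.compile(r"\[[^\]]*\]|\([^)]*\)")
_WHITESPACE = re.compile(r"\s+")


def clean_target_text(line: str) -> str:
    """Normalize one target-side subtitle line (illustrative rules only)."""
    line = unicodedata.normalize("NFKC", line)  # unify Unicode variants
    line = _BRACKETED.sub(" ", line)            # drop bracketed annotations
    line = _WHITESPACE.sub(" ", line)           # collapse runs of whitespace
    return line.strip()


def clean_corpus(lines):
    """Clean every line and drop the ones that end up empty."""
    cleaned = (clean_target_text(line) for line in lines)
    return [line for line in cleaned if line]
```

For instance, `clean_corpus(["(Applaus)", "Die  Nachrichten"])` keeps only the second line, with its double space collapsed. Which rules actually help would have to be checked against the real subtitle data.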




Tackling Low-Resourced Sign Language Translation: UPC at WMT-SLT 22

This paper describes the system developed at the Universitat Politècnica...

Sign Language Translation from Instructional Videos

The advances in automatic sign language translation (SLT) to spoken lang...

Sign Language Translation with Transformers

Sign Language Translation (SLT) first uses a Sign Language Recognition (...

Explore More Guidance: A Task-aware Instruction Network for Sign Language Translation Enhanced with Data Augmentation

Sign language recognition and translation first uses a recognition modul...

SignNet: Single Channel Sign Generation using Metric Embedded Learning

A true interpreting agent not only understands sign language and transla...

SIGMORPHON 2023 Shared Task of Interlinear Glossing: Baseline Model

Language documentation is a critical aspect of language preservation, of...

Enhancing Portuguese Sign Language Animation with Dynamic Timing and Mouthing

Current signing avatars are often described as unnatural as they cannot ...