Towards Character-Level Transformer NMT by Finetuning Subword Systems

04/29/2020
by Jindřich Libovický et al.

Applying the Transformer architecture at the character level usually requires very deep models that are difficult and slow to train. A few approaches have been proposed that partially overcome this problem by using explicit segmentation into tokens. We show that by initially training a subword model based on this segmentation and then finetuning it on characters, we can obtain a neural machine translation model that works at the character level without requiring segmentation. Without changing the vanilla 6-layer Transformer Base architecture, we train purely character-level models. Our character-level models better capture morphological phenomena and show much higher robustness towards source-side noise, at the expense of somewhat worse overall translation quality. Our study is a significant step towards high-performance character-based models that are not extremely large.
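The two-phase recipe in the abstract amounts to changing the input granularity between training phases: subword tokens first, then single characters. The sketch below illustrates that change; the subword split and the "▁" word-boundary marker are illustrative assumptions (SentencePiece-style), not the authors' exact preprocessing.

```python
# Minimal sketch of the two input granularities in the paper's recipe:
# subword tokens for the initial training phase, single characters for
# the finetuning phase. The subword split and the "▁" boundary marker
# are illustrative assumptions, not the authors' exact preprocessing.

def char_tokenize(sentence: str) -> list[str]:
    """Segment a sentence into characters, marking word boundaries."""
    # Replace spaces with an explicit boundary symbol so the model can
    # still recover word structure from the character stream.
    return list(sentence.replace(" ", "▁"))

subword_tokens = ["▁trans", "lation", "▁works"]  # hypothetical subword split
char_tokens = char_tokenize("translation works")

print(subword_tokens)
print(char_tokens)
```

Because the architecture itself is unchanged, only the token inventory and sequence length differ between the two phases: the character vocabulary is much smaller, while sequences become several times longer.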

Related research

08/08/2023: Character-level NMT and language similarity
We explore the effectiveness of character-level neural machine translati...

11/12/2019: Character-based NMT with Transformer
Character-based translation has several appealing advantages, but its pe...

09/30/2021: SCIMAT: Science and Mathematics Dataset
In this work, we announce a comprehensive well curated and opensource da...

09/10/2020: On Target Segmentation for Direct Speech Translation
Recent studies on direct speech translation show continuous improvements...

12/02/2022: Subword-Delimited Downsampling for Better Character-Level Translation
Subword-level models have been the dominant paradigm in NLP. However, ch...

05/26/2023: TranSFormer: Slow-Fast Transformer for Machine Translation
Learning multiscale Transformer models has been evidenced as a viable ap...

05/27/2022: Patching Leaks in the Charformer for Efficient Character-Level Generation
Character-based representations have important advantages over subword-b...
