Towards Computational Linguistics in Minangkabau Language: Studies on Sentiment Analysis and Machine Translation

09/19/2020
by Fajri Koto, et al.

Although some linguists (Rusmali et al., 1985; Crouch, 2009) have attempted to describe the morphology and syntax of Minangkabau, information processing for this language is still absent because annotated resources are scarce. In this work, we release two Minangkabau corpora, one for sentiment analysis and one for machine translation, harvested and constructed from Twitter and Wikipedia. We conduct the first computational linguistics study of the Minangkabau language, employing classic machine learning and sequence-to-sequence models such as LSTM and Transformer. Our first experiments show that classification performance on Minangkabau text drops significantly when the text is evaluated with a model trained on Indonesian. In the machine translation experiment, by contrast, a simple word-to-word translation using a bilingual dictionary outperforms the LSTM and Transformer models in terms of BLEU score.
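The word-to-word baseline mentioned in the abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function name `translate_word_by_word`, the whitespace tokenization, the lowercasing, the choice to copy unknown words unchanged, the Minangkabau-to-Indonesian direction, and the five-entry toy lexicon.

```python
def translate_word_by_word(sentence, lexicon):
    """Replace each token with its dictionary entry, if one exists.

    Tokens missing from the lexicon are copied through unchanged
    (an assumption of this sketch, not a detail from the paper).
    """
    tokens = sentence.lower().split()  # naive whitespace tokenization
    return " ".join(lexicon.get(token, token) for token in tokens)


# Toy, hand-written entries for illustration only; this is NOT the
# bilingual dictionary used in the paper.
toy_lexicon = {
    "ambo": "saya",
    "indak": "tidak",
    "pai": "pergi",
    "ka": "ke",
    "pasa": "pasar",
}

print(translate_word_by_word("Ambo indak pai ka pasa", toy_lexicon))
```

Such a baseline performs no reordering and no disambiguation, so it can only score well on BLEU when the two languages share word order and much of their vocabulary.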


