NLNDE at SemEval-2023 Task 12: Adaptive Pretraining and Source Language Selection for Low-Resource Multilingual Sentiment Analysis

04/28/2023
by   Mingyang Wang, et al.

This paper describes our system developed for SemEval-2023 Task 12, "Sentiment Analysis for Low-resource African Languages using Twitter Dataset". Sentiment analysis is one of the most widely studied applications in natural language processing, yet most prior work still focuses on a small number of high-resource languages. Building reliable sentiment analysis systems for low-resource languages remains challenging because training data for these languages is limited. In this work, we propose to leverage language-adaptive and task-adaptive pretraining on African texts, and we study transfer learning with source language selection on top of an African language-centric pretrained language model. Our key findings are: (1) Adapting the pretrained model to the target language and task using a small yet relevant corpus improves performance by more than 10 F1 points. (2) Selecting source languages with positive transfer gains during training avoids harmful interference from dissimilar languages, leading to better results in multilingual and cross-lingual settings. In the shared task, our system wins 8 out of 15 tracks and, in particular, performs best in the multilingual evaluation.
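The source language selection idea in finding (2) can be illustrated with a minimal sketch. It assumes that a "transfer gain" is the difference between the target-language F1 obtained when a candidate source language is added to training and the F1 of a target-only baseline; the abstract does not define the exact criterion, so this definition, the function name `select_source_languages`, the `min_gain` threshold, and the placeholder language names and scores are all hypothetical.

```python
from typing import Dict, List


def select_source_languages(
    baseline_f1: float,
    transfer_f1: Dict[str, float],
    min_gain: float = 0.0,
) -> List[str]:
    """Keep source languages whose transfer gain exceeds min_gain.

    baseline_f1: target-language F1 when training on the target data only.
    transfer_f1: target-language F1 when each candidate source is added.
    Returns the selected languages, sorted by gain in descending order.
    """
    # Assumed definition: gain = F1 with the source added minus baseline F1.
    gains = {lang: f1 - baseline_f1 for lang, f1 in transfer_f1.items()}
    selected = [lang for lang, gain in gains.items() if gain > min_gain]
    return sorted(selected, key=lambda lang: gains[lang], reverse=True)


# Made-up scores for illustration only; these are not results from the paper.
baseline = 0.60
candidates = {"src_a": 0.64, "src_b": 0.58, "src_c": 0.61}
print(select_source_languages(baseline, candidates))
```

In this toy setting, `src_b` lowers the target score and is therefore excluded from the training mix, which mirrors the idea of avoiding interference from dissimilar languages.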


