Emoji Prediction: Extensions and Benchmarking

07/14/2020
by Weicheng Ma, et al.

Emojis are a succinct form of language that can express concrete meanings, emotions, and intentions. They also carry signals that help reveal communicative intent. Emojis have become a ubiquitous part of daily life, which makes them important for understanding user-generated content. The emoji prediction task aims to predict the proper set of emojis associated with a piece of text. Through emoji prediction, models can learn rich representations of the communicative intent of written text. Existing research on the emoji prediction task focuses on a small subset of emoji types closely related to certain emotions; this setting oversimplifies the task and wastes the expressive power of emojis. In this paper, we extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification. We propose novel models for multi-class and multi-label emoji prediction based on Transformer networks. We also construct multiple emoji prediction datasets from Twitter using heuristics. The BERT models achieve state-of-the-art performance on all our datasets under all settings, with relative improvements of 27.21% to 236.36% in F-1 score compared to the prior state of the art. Our results demonstrate the efficacy of deep Transformer-based models on the emoji prediction task. We also release our datasets at https://github.com/hikari-NYU/Emoji_Prediction_Datasets_MMS for future researchers.
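
The difference between the multi-class and multi-label settings mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the label names, the logits, and the 0.5 threshold are hypothetical values chosen for illustration, and the logits stand in for the per-emoji scores a Transformer classifier head might produce. In the multi-class setting exactly one emoji is chosen (softmax followed by argmax); in the multi-label setting each emoji is decided independently (sigmoid followed by a threshold), so a text can receive several emojis or none.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution over all labels."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    """Turn one raw score into an independent probability for one label."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_multi_class(logits, labels):
    """Multi-class: pick the single most probable emoji."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]

def predict_multi_label(logits, labels, threshold=0.5):
    """Multi-label: keep every emoji whose own probability clears the threshold."""
    return [label for label, z in zip(labels, logits) if sigmoid(z) >= threshold]

# Hypothetical label set and scores, for illustration only.
EMOJI_LABELS = ["joy", "heart", "fire", "sob"]
LOGITS = [2.1, 0.3, -1.2, 1.4]

single = predict_multi_class(LOGITS, EMOJI_LABELS)
multi = predict_multi_label(LOGITS, EMOJI_LABELS)
print(single)
print(multi)
```

With the same scores, the multi-class decision keeps only the top-scoring label, while the multi-label decision keeps every label whose individual probability is at least the threshold. The threshold is a tunable assumption here, not a value taken from the paper.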

research
10/28/2022

Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion

Detecting emotions expressed in text has become critical to a range of f...
research
10/11/2020

Few-shot Learning for Multi-label Intent Detection

In this paper, we study the few-shot multi-label classification for user...
research
06/27/2015

Twitter User Geolocation Using a Unified Text and Network Prediction Model

We propose a label propagation approach to geolocation prediction based ...
research
12/07/2020

Improvements and Extensions on Metaphor Detection

Metaphors are ubiquitous in human language. The metaphor detection task ...
research
05/24/2021

Classifying Math KCs via Task-Adaptive Pre-Trained BERT

Educational content labeled with proper knowledge components (KCs) are p...
research
04/11/2019

Elucidating image-to-set prediction: An analysis of models, losses and datasets

In recent years, we have experienced a flurry of contributions in the mu...
research
07/30/2023

LaFiCMIL: Rethinking Large File Classification from the Perspective of Correlated Multiple Instance Learning

Transformer-based models have revolutionized the performance of a wide r...