A Large Self-Annotated Corpus for Sarcasm

04/19/2017
by Mikhail Khodak, et al.

We introduce the Self-Annotated Reddit Corpus (SARC), a large corpus for sarcasm research and for training and evaluating sarcasm detection systems. The corpus contains 1.3 million sarcastic statements, 10 times more than any previous dataset, and many times more non-sarcastic statements, which allows for learning in both balanced and unbalanced label regimes. Each statement is furthermore self-annotated (sarcasm is labeled by the author, not by an independent annotator) and is provided with user, topic, and conversation context. We evaluate the corpus for accuracy, compare it to previous related corpora, and provide baselines for the task of sarcasm detection.
