EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets

by Maram Hasanain, et al.

This article introduces a new language-independent approach for creating a large-scale, high-quality test collection of tweets that supports multiple information retrieval (IR) tasks without running a shared-task campaign. The approach, demonstrated over Arabic tweets, designs the collection around significant (i.e., popular) events, which enables the development of topics that represent frequent information needs of Twitter users and for which rich content exists. This inherently supports multiple tasks that revolve around events, namely event detection, ad-hoc search, timeline generation, and real-time summarization. The key highlights of the approach include diversifying the judgment pool via interactive search and multiple manually crafted queries per topic, collecting high-quality annotations from crowd-workers for relevance and from in-house annotators for novelty, filtering out low-agreement topics and inaccessible tweets, and providing multiple subsets of the collection for better availability. Applying our methodology to Arabic tweets resulted in EveTAR, the first freely available tweet test collection for multiple IR tasks. EveTAR includes a crawl of 355M Arabic tweets and covers 50 significant events, for which about 62K tweets were judged with substantial average inter-annotator agreement (Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating existing algorithms in the respective tasks. Results indicate that the new collection can support reliable ranking of IR systems that is comparable to similar TREC collections, while providing strong baseline results for future studies over Arabic tweets.
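The agreement-based topic filtering mentioned above can be illustrated with a minimal sketch. It assumes Cohen's kappa over two annotators' labels per topic; the abstract does not say which kappa variant or cutoff was used, so the function names, the two-annotator setup, and the `threshold` value are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def keep_high_agreement_topics(judgments, threshold):
    """Return topic ids whose kappa reaches the (assumed) threshold.

    `judgments` maps a topic id to a pair (labels_a, labels_b).
    """
    return [
        topic
        for topic, (labels_a, labels_b) in judgments.items()
        if cohen_kappa(labels_a, labels_b) >= threshold
    ]


# Toy data, not taken from EveTAR: 1 = relevant, 0 = not relevant.
toy_judgments = {
    "T1": ([1, 1, 0, 0, 1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]),
    "T2": ([1, 0, 1, 0], [0, 1, 1, 0]),
}
kept = keep_high_agreement_topics(toy_judgments, threshold=0.5)
```

In this toy data the annotators agree well on topic T1 and only at chance level on T2, so T2 would be dropped under the assumed cutoff.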




ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks

In this paper, we present ArCOV-19, an Arabic COVID-19 Twitter dataset t...

Arabic Offensive Language on Twitter: Analysis and Experiments

Detecting offensive language on Twitter has many applications ranging fr...

Build Fast and Accurate Lemmatization for Arabic

In this paper we describe the complexity of building a lemmatizer for Ar...

ArCorona: Analyzing Arabic Tweets in the Early Days of Coronavirus (COVID-19) Pandemic

Over the past few months, there were huge numbers of circulating tweets ...

An Information Retrieval Approach to Building Datasets for Hate Speech Detection

Building a benchmark dataset for hate speech detection presents several ...

NAYEL at SemEval-2020 Task 12: TF/IDF-Based Approach for Automatic Offensive Language Detection in Arabic Tweets

In this paper, we present the system submitted to "SemEval-2020 Task 12"...

Understanding collective human movement dynamics during large-scale events using big geosocial data analytics

With the rapid advancement of information and communication technologies...
