Similarities between Arabic Dialects: Investigating Geographical Proximity

05/10/2021
by Abdulkareem Alsudais, et al.

The automatic classification of Arabic dialects is an ongoing research challenge. Recent work has explored it by defining dialects over increasingly narrow geographic areas, such as cities and provinces. This paper focuses on a related yet relatively unexplored topic: how the geographical proximity of cities in Arab countries affects their dialectical similarity. Our approach has two parts: 1) comparing the textual similarity between dialects using cosine similarity and 2) measuring the geographical distance between locations. We study MADAR and NADI, two established datasets with Arabic dialects from many cities and provinces. Our results indicate that cities located in different countries may in fact have more dialectical similarity than cities within the same country, depending on their geographical proximity. The correlation between dialectical similarity and city proximity suggests that cities that are closer together are more likely to share dialectical attributes, regardless of country borders. This nuance has the potential to advance Arabic dialect research, because it indicates that a more granular approach to dialect classification is essential to framing the problem of Arabic dialect identification.
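The two measurements described in the abstract, and the correlation between them, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not say how texts are vectorized or how distance is computed, so the whitespace-token count vectors, the haversine great-circle formula, the Pearson coefficient, and all function names below are assumptions made for the example.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between token-count vectors of two texts.

    Assumption: whitespace tokenization and raw counts; the paper may
    use a different representation.
    """
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text shares nothing with any other text
    return dot / (norm_a * norm_b)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points in degrees.

    Assumption: a spherical Earth with mean radius 6371 km.
    """
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding pushing h slightly above 1.
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, h)))


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute a similarity and a distance for every pair of cities and pass the two resulting lists to `pearson`. A negative coefficient would correspond to the pattern the abstract reports: similarity falls as distance grows.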

