Semi-Automatic Data Annotation, POS Tagging and Mildly Context-Sensitive Disambiguation: the eXtended Revised AraMorph (XRAM)

03/06/2016
by Giuliano Lancioni, et al.

This paper presents eXtended Revised AraMorph (henceforth XRAM), an extended and revised form of Tim Buckwalter's Arabic lexical and morphological resource AraMorph. XRAM addresses a number of weaknesses and inconsistencies of the original model and allows wider coverage of real-world Classical and contemporary (both formal and informal) Arabic texts. Building upon previous research, XRAM's enhancements include (i) flag-selectable usage markers; (ii) probabilistic, mildly context-sensitive POS tagging, filtering, disambiguation and ranking of alternative morphological analyses; and (iii) semi-automatic extension of lexical coverage through extraction of lexical and morphological information from existing lexical resources. Testing of XRAM through a front-end Python module showed a remarkable success level.
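To make the ideas of usage-marker filtering and context-sensitive ranking concrete, the following is a minimal Python sketch and not XRAM's actual code. Every name in it (`rank_analyses`, `TAG_BIGRAM_PROB`, the `usage` labels, the candidate entries and all probabilities) is a hypothetical illustration. It assumes a simple bigram model over POS tags, which is only one possible reading of "mildly context-sensitive".

```python
from math import log

# Hypothetical toy transition probabilities P(tag | previous tag).
# These numbers are invented for illustration, not taken from XRAM.
TAG_BIGRAM_PROB = {
    ("PREP", "NOUN"): 0.7,
    ("PREP", "VERB"): 0.02,
    ("DET", "NOUN"): 0.6,
    ("DET", "VERB"): 0.05,
}
DEFAULT_PROB = 0.01  # fallback for tag pairs missing from the table

# Hypothetical alternative analyses for one ambiguous token.
CANDIDATES = [
    {"lemma": "katab", "tag": "VERB", "usage": "standard", "lexical_prob": 0.5},
    {"lemma": "kitAb", "tag": "NOUN", "usage": "standard", "lexical_prob": 0.3},
    {"lemma": "kuttAb", "tag": "NOUN", "usage": "dialectal", "lexical_prob": 0.2},
]


def rank_analyses(prev_tag, candidates, allowed_usages):
    """Filter analyses by usage marker, then rank them best first.

    The score is log P(analysis) + log P(tag | prev_tag), so the
    previous token's tag is the only context that is consulted.
    """
    def score(analysis):
        transition = TAG_BIGRAM_PROB.get((prev_tag, analysis["tag"]), DEFAULT_PROB)
        return log(analysis["lexical_prob"]) + log(transition)

    # Usage markers act as on/off flags: disabled usages are dropped.
    kept = [a for a in candidates if a["usage"] in allowed_usages]
    return sorted(kept, key=score, reverse=True)


if __name__ == "__main__":
    for analysis in rank_analyses("PREP", CANDIDATES, {"standard"}):
        print(analysis["lemma"], analysis["tag"])
```

With the toy numbers above, a preceding preposition makes the noun reading outrank the verb reading even though the verb has the higher context-free probability. Enabling the `dialectal` flag simply admits a further candidate into the ranking.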


research
09/16/2021

Automatic Error Type Annotation for Arabic

We present ARETA, an automatic error type annotation system for Modern S...
research
10/05/2019

Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging

Semitic languages can be highly ambiguous, having several interpretation...
research
05/10/2019

Restoring Arabic vowels through omission-tolerant dictionary lookup

Vowels in Arabic are optional orthographic symbols written as diacritics...
research
08/25/2018

MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction

In this paper, we introduce MADARi, a joint morphological annotation and...
research
10/28/2019

Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling

Morphological tagging is challenging for morphologically rich languages ...
research
06/14/2021

Contemporary Amharic Corpus: Automatically Morpho-Syntactically Tagged Amharic Corpus

We introduced the contemporary Amharic corpus, which is automatically ta...
research
07/21/2019

Augmenting a BiLSTM tagger with a Morphological Lexicon and a Lexical Category Identification Step

Previous work on using BiLSTM models for PoS tagging has primarily focus...
