Word Embeddings for the Armenian Language: Intrinsic and Extrinsic Evaluation

06/07/2019
by Karen Avetisyan, et al.

In this work, we evaluate and compare existing word embedding models for the Armenian language, both intrinsically and extrinsically. We also present new embeddings trained with the GloVe, fastText, CBOW, and SkipGram algorithms. For intrinsic evaluation, we adapt the word analogy task to Armenian. For extrinsic evaluation, we use two tasks: morphological tagging and text classification. Tagging is performed with a deep neural network on the ArmTDP v2.3 dataset. For text classification, we propose a corpus of news articles categorized into 7 classes. The datasets are made public to serve as benchmarks for future models.
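To make the word analogy task concrete, here is a minimal sketch of how an analogy question "a is to b as c is to ?" can be scored against an embedding table. It assumes the common vector-offset scoring (often called 3CosAdd); the abstract does not state which scoring rule the authors use. The function names, the toy vocabulary, and the toy vectors are all hypothetical and are not taken from the paper's data.

```python
import numpy as np


def normalize(vectors):
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def solve_analogy(words, vectors, a, b, c):
    """Return the word closest to (b - a + c), excluding the three query words.

    This is vector-offset scoring, an assumed choice for illustration only.
    """
    index = {w: i for i, w in enumerate(words)}
    unit = normalize(vectors)
    target = unit[index[b]] - unit[index[a]] + unit[index[c]]
    scores = unit @ target
    # The query words themselves are not allowed as answers.
    for w in (a, b, c):
        scores[index[w]] = -np.inf
    return words[int(np.argmax(scores))]


def analogy_accuracy(words, vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(
        solve_analogy(words, vectors, a, b, c) == expected
        for a, b, c, expected in questions
    )
    return correct / len(questions)


# Hypothetical toy embedding table (English words, hand-made 3-d vectors).
words = ["man", "woman", "king", "queen", "apple"]
vectors = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 1.0],
    [-1.0, 0.2, -0.3],
])

questions = [
    ("man", "woman", "king", "queen"),
    ("king", "queen", "man", "woman"),
]
accuracy = analogy_accuracy(words, vectors, questions)
```

With real embeddings, the vocabulary and matrix would be loaded from a trained model and the question list from an analogy benchmark; accuracy over all questions then serves as the intrinsic score.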

research
03/10/2020

Text classification with word embedding regularization and soft similarity measure

Since the seminal work of Mikolov et al., word embeddings have become th...
research
06/06/2018

The Limitations of Cross-language Word Embeddings Evaluation

The aim of this work is to explore the possible limitations of existing ...
research
05/24/2018

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms

Many deep learning architectures have been proposed to model the composi...
research
12/14/2022

AsPOS: Assamese Part of Speech Tagger using Deep Learning Approach

Part of Speech (POS) tagging is crucial to Natural Language Processing (...
research
07/11/2021

Document Embedding for Scientific Articles: Efficacy of Word Embeddings vs TFIDF

Over the last few years, neural network derived word embeddings became p...
research
04/17/2021

Are Word Embedding Methods Stable and Should We Care About It?

A representation learning method is considered stable if it consistently...
research
11/15/2020

The Challenge of Diacritics in Yoruba Embeddings

The major contributions of this work include the empirical establishment...
