An Ensemble Method for Producing Word Representations for the Greek Language

12/10/2019
by Michalis Lioudakis, et al.

In this paper we present a new ensemble method, Continuous Bag-of-Skip-grams (CBOS), that produces high-quality word representations for the Greek language. CBOS combines the two pioneering approaches to learning word representations: Continuous Bag-of-Words (CBOW) and Continuous Skip-gram. These methods are compared on a word analogy task using three data sources: the English Wikipedia corpus, the Greek Wikipedia corpus, and the Greek Web Content corpus. The comparison across these datasets shows that CBOS achieves state-of-the-art performance.
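For readers unfamiliar with the word analogy task mentioned in the abstract, the following is a minimal sketch of how such an evaluation is commonly scored, assuming the widely used vector-offset rule (the answer to "a is to b as c is to ?" is the word closest in cosine similarity to b - a + c). The abstract does not specify the exact protocol, so the scoring rule, the function names (`cosine`, `solve_analogy`, `analogy_accuracy`), and the toy three-dimensional vectors are illustrative assumptions, not the authors' setup or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def solve_analogy(vectors, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset rule.

    Assumed scoring rule: pick the word whose vector is most similar
    to vectors[b] - vectors[a] + vectors[c], excluding the query words.
    """
    target = [vb - va + vc
              for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


def analogy_accuracy(vectors, questions):
    """Fraction of (a, b, c, expected) questions answered correctly."""
    correct = sum(solve_analogy(vectors, a, b, c) == expected
                  for a, b, c, expected in questions)
    return correct / len(questions)


# Hypothetical toy embeddings, hand-made for illustration only.
toy_vectors = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "apple": [0.0, 0.0, -1.0],
}

# Hypothetical analogy questions in (a, b, c, expected) form.
toy_questions = [
    ("man", "woman", "king", "queen"),
    ("man", "king", "woman", "queen"),
]

accuracy = analogy_accuracy(toy_vectors, toy_questions)
```

In practice the same scoring would be run once per embedding model (for example one trained with CBOW, one with Skip-gram, and one with the ensemble) on the same question set, so that the resulting accuracies can be compared directly.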


