Bianet: A Parallel News Corpus in Turkish, Kurdish and English

05/14/2018
by Duygu Ataman, et al.

We present a new open-source parallel corpus of news articles collected from the Bianet magazine, an online newspaper that publishes Turkish news, often together with English and Kurdish translations. In this paper, we describe how the corpus was collected and report its statistical properties. We validate the benefit of the Bianet corpus by evaluating bilingual and multilingual neural machine translation models in the English-Turkish and English-Kurdish directions.

research
01/24/2019

Automatic Parallel Corpus Creation for Hindi-English News Translation Task

The parallel corpus for multilingual NLP tasks, deep learning applicatio...
research
05/29/2023

A Corpus for Sentence-level Subjectivity Detection on English News Articles

We present a novel corpus for subjectivity detection at the sentence lev...
research
01/14/2022

Multilingual Open Text 1.0: Public Domain News in 44 Languages

We present a new multilingual corpus containing text in 44 languages, ma...
research
08/21/2018

ISNA-Set: A novel English Corpus of Iran NEWS

News agencies publish news on their websites all over the world. Moreove...
research
03/13/2017

A Visual Representation of Wittgenstein's Tractatus Logico-Philosophicus

In this paper we present a data visualization method together with its p...
research
08/13/2021

MIND - Mainstream and Independent News Documents Corpus

This paper presents and characterizes MIND, a new Portuguese corpus comp...
research
02/06/2021

From Toxicity in Online Comments to Incivility in American News: Proceed with Caution

The ability to quantify incivility online, in news and in congressional ...
