The Cross-Lingual Arabic Information REtrieval (CLAIRE) System

07/29/2021
by Zhizhong Chen, et al.

Despite advances in neural machine translation, cross-lingual retrieval tasks, in which queries and documents live in different natural language spaces, remain challenging. Although neural translation models may provide an intuitive approach to the cross-lingual problem, their resource-intensive training and complex model structures may complicate the overall retrieval pipeline and reduce user engagement. In this paper, we build our end-to-end Cross-Lingual Arabic Information REtrieval (CLAIRE) system on cross-lingual word embeddings. Searchers are assumed to have a passable passive understanding of Arabic, and various supporting information in English is provided to aid the retrieval experience. The proposed system has three major advantages: (1) The use of English-Arabic word embeddings simplifies the overall pipeline and avoids the potential mistakes caused by machine translation. (2) Our CLAIRE system can incorporate arbitrary word embedding-based neural retrieval models without structural modification. (3) Early empirical results on an Arabic news collection show promising performance.
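To make the idea of retrieval over a shared English-Arabic embedding space concrete, here is a minimal sketch. It is not the CLAIRE implementation: the embedding values, the document names, and the helpers `average_vector`, `cosine`, and `rank` are all hypothetical, and averaging word vectors with cosine scoring is a simple stand-in for whichever embedding-based neural retrieval model the system actually plugs in.

```python
import math

# Toy shared English-Arabic embedding space.
# ASSUMPTION: these vectors are invented for illustration; a real system
# would load pretrained, aligned cross-lingual word embeddings.
EMBEDDINGS = {
    "economy": [0.90, 0.10, 0.00],
    "اقتصاد": [0.88, 0.12, 0.02],  # Arabic: "economy"
    "سوق": [0.80, 0.30, 0.05],     # Arabic: "market"
    "football": [0.00, 0.20, 0.95],
    "كرة": [0.05, 0.15, 0.90],     # Arabic: "ball"
    "القدم": [0.02, 0.25, 0.92],   # Arabic: "the foot"
}


def average_vector(tokens):
    """Average the embeddings of known tokens; return None if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_tokens, documents):
    """Score each Arabic document against an English query, best first."""
    q = average_vector(query_tokens)
    scored = []
    for doc_id, tokens in documents.items():
        d = average_vector(tokens)
        score = cosine(q, d) if q is not None and d is not None else 0.0
        scored.append((doc_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical two-document Arabic collection, already tokenized.
DOCS = {
    "doc_economy": ["اقتصاد", "سوق"],
    "doc_sports": ["كرة", "القدم"],
}

ranking = rank(["economy"], DOCS)
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Because the English query and the Arabic documents are compared directly in one vector space, no translation step appears anywhere in this sketch, which is the pipeline simplification the abstract describes.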


