Cross-Lingual Relevance Transfer for Document Retrieval
Recent work has shown the surprising ability of multi-lingual BERT to serve as a zero-shot cross-lingual transfer model for a number of language processing tasks. We combine this finding with a similarly-recently proposal on sentence-level relevance modeling for document retrieval to demonstrate the ability of multi-lingual BERT to transfer models of relevance across languages. Experiments on test collections in five different languages from diverse language families (Chinese, Arabic, French, Hindi, and Bengali) show that models trained with English data improve ranking quality, without any special processing, both for (non-English) mono-lingual retrieval as well as cross-lingual retrieval.
READ FULL TEXT