Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization

04/28/2022
by Ruipeng Jia, et al.

In zero-shot multilingual extractive text summarization, a model is typically trained on an English summarization dataset and then applied to summarization datasets in other languages. Given English gold summaries and documents, sentence-level labels for extractive summarization are usually generated using heuristics. However, these monolingual labels created on English datasets may not be optimal for datasets in other languages, because of syntactic or semantic discrepancies between languages. It is therefore possible to translate the English dataset into other languages and obtain different sets of labels, again using heuristics. To fully leverage the information in these different sets of labels, we propose NLSSum (Neural Label Search for Summarization), which jointly learns hierarchical weights for these different sets of labels together with our summarization model. We conduct multilingual zero-shot summarization experiments on the MLSUM and WikiLingua datasets, and we achieve state-of-the-art results under both human and automatic evaluation across these two datasets.
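The abstract does not say which heuristic produces the sentence-level labels, so the following is only a minimal sketch of one common choice: greedily selecting the sentences that most improve overlap with the gold summary. The function names (`unigram_f1`, `greedy_labels`, `combine_label_sets`) are hypothetical, unigram-overlap F1 stands in for ROUGE, and the fixed weights in `combine_label_sets` are an illustrative simplification; NLSSum learns hierarchical weights jointly with the summarizer rather than fixing them by hand.

```python
from collections import Counter


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1, a simplified stand-in for ROUGE-1 (assumption)."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_labels(doc_sentences, summary, max_sentences=3):
    """Return one 0/1 label per sentence via greedy overlap maximization."""
    reference = summary.lower().split()
    tokenized = [s.lower().split() for s in doc_sentences]
    selected = []
    best_score = 0.0
    while len(selected) < max_sentences:
        best_idx = None
        for i in range(len(tokenized)):
            if i in selected:
                continue
            # Score the already selected sentences plus candidate sentence i.
            candidate = [tok for j in selected + [i] for tok in tokenized[j]]
            score = unigram_f1(candidate, reference)
            if score > best_score:
                best_score, best_idx = score, i
        if best_idx is None:  # no remaining sentence improves the score
            break
        selected.append(best_idx)
    return [1 if i in selected else 0 for i in range(len(doc_sentences))]


def combine_label_sets(label_sets, weights):
    """Weighted average of several label sets for the same document.

    The weights here are fixed, hypothetical values used for illustration only.
    """
    total = sum(weights)
    n = len(label_sets[0])
    return [
        sum(w * labels[i] for w, labels in zip(weights, label_sets)) / total
        for i in range(n)
    ]
```

In this sketch, `greedy_labels` would be run once on the English data and once on each translated copy, and `combine_label_sets` would merge the resulting label sets into soft targets for training.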

research
05/30/2022

X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents

The number of scientific publications nowadays is rapidly increasing, ca...
research
06/02/2021

Evaluating the Efficacy of Summarization Evaluation across Languages

While automatic summarization evaluation methods developed for English a...
research
09/26/2022

News Summarization and Evaluation in the Era of GPT-3

The recent success of zero- and few-shot prompting with models like GPT-...
research
09/26/2022

Text Summarization with Oracle Expectation

Extractive summarization produces summaries by identifying and concatena...
research
11/21/2022

Extended Multilingual Protest News Detection – Shared Task 1, CASE 2021 and 2022

We report results of the CASE 2022 Shared Task 1 on Multilingual Protest...
research
10/02/2019

SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders

We propose an end-to-end neural model for zero-shot abstractive text sum...
research
07/30/2021

Towards Universality in Multilingual Text Rewriting

In this work, we take the first steps towards building a universal rewri...
