μPLAN: Summarizing using a Content Plan as Cross-Lingual Bridge

05/23/2023
by Fantine Huot, et al.

Cross-lingual summarization consists of generating a summary in one language given an input document in a different language, which makes relevant content accessible to speakers of other languages. However, this task remains challenging, mainly because of the need for cross-lingual datasets and the compounded difficulty of summarizing and translating. This work presents μPLAN, an approach to cross-lingual summarization that uses an intermediate planning step as a cross-lingual bridge. We formulate the plan as a sequence of entities that captures the conceptualization of the summary, that is, it identifies the salient content and the order in which to present it, separately from the surface form. Using a multilingual knowledge base, we align the entities to their canonical designation across languages. μPLAN models first learn to generate the plan and then continue generating the summary conditioned on the plan and the input. We evaluate our methodology on the XWikis dataset on cross-lingual pairs across four languages and demonstrate that this planning objective achieves state-of-the-art performance in terms of ROUGE and faithfulness scores. Moreover, this planning approach improves zero-shot transfer to new cross-lingual language pairs compared to non-planning baselines.
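
The plan-then-summarize target described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy knowledge base (`SURFACE_TO_ID`, `CANONICAL`), the separators (`PLAN_SEP`, `SUMMARY_MARKER`), and the function names are all hypothetical, chosen only to show how an entity plan aligned to canonical names could be prepended to a summary to form a single decoding target.

```python
from typing import Dict, List

# Hypothetical toy knowledge base (an assumption for illustration only):
# surface forms in any language map to a language-independent entity ID,
# and each ID maps to a canonical name per language.
SURFACE_TO_ID: Dict[str, str] = {
    "Allemagne": "E1",
    "Germany": "E1",
    "Deutschland": "E1",
    "Rhin": "E2",
    "Rhine": "E2",
}
CANONICAL: Dict[str, Dict[str, str]] = {
    "E1": {"en": "Germany", "fr": "Allemagne", "de": "Deutschland"},
    "E2": {"en": "Rhine", "fr": "Rhin", "de": "Rhein"},
}

# Hypothetical formatting choices; the source does not specify the exact format.
PLAN_SEP = " | "
SUMMARY_MARKER = " [SUMMARY] "


def build_plan(mentions: List[str], target_lang: str) -> List[str]:
    """Map entity mentions, in order, to canonical names in target_lang."""
    plan: List[str] = []
    for mention in mentions:
        entity_id = SURFACE_TO_ID.get(mention)
        if entity_id is None:
            continue  # mentions missing from the toy knowledge base are skipped
        plan.append(CANONICAL[entity_id][target_lang])
    return plan


def build_target(plan: List[str], summary: str) -> str:
    """Concatenate the plan and the summary into one output sequence."""
    return PLAN_SEP.join(plan) + SUMMARY_MARKER + summary


plan = build_plan(["Rhin", "Allemagne", "inconnu"], target_lang="en")
target = build_target(plan, "The Rhine flows through Germany.")
print(target)
```

Because the plan comes first in the target sequence, a model trained on such targets would generate the entity sequence before the summary text, so the summary tokens are conditioned on both the input document and the plan.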

research
04/23/2022

WikiMulti: a Corpus for Cross-Lingual Summarization

Cross-lingual summarization (CLS) is the task to produce a summary in on...
research
04/04/2023

SimCSum: Joint Learning of Simplification and Cross-lingual Summarization for Cross-lingual Science Journalism

Cross-lingual science journalism generates popular science stories of sc...
research
05/16/2023

Towards Unifying Multi-Lingual and Cross-Lingual Summarization

To adapt text summarization to the multilingual world, previous work pro...
research
05/15/2023

PMIndiaSum: Multilingual and Cross-lingual Headline Summarization for Languages in India

This paper introduces PMIndiaSum, a new multilingual and massively paral...
research
08/14/2018

Cross-Lingual Cross-Platform Rumor Verification Pivoting on Multimedia Content

With the increasing popularity of smart devices, rumors with multimedia ...
research
03/31/2021

A Neighbourhood Framework for Resource-Lean Content Flagging

We propose a novel interpretable framework for cross-lingual content fla...
research
12/06/2022

GAS-Net: Generative Artistic Style Neural Networks for Fonts

Generating new fonts is a time-consuming and labor-intensive, especially...
