Effectiveness of French Language Models on Abstractive Dialogue Summarization Task

07/17/2022
by Yongxin Zhou, et al.

Pre-trained language models have established the state of the art on various natural language processing tasks, including dialogue summarization, which allows readers to quickly access key information from long conversations such as meetings, interviews, or phone calls. However, such dialogues remain difficult for current models to handle because the spontaneity of the language produces expressions that are rarely present in the corpora used to pre-train the language models. Moreover, the vast majority of work in this field has focused on English. In this work, we present a study of the summarization of spontaneous oral dialogues in French using several language-specific pre-trained models, BARThez and BelGPT-2, as well as multilingual pre-trained models, mBART, mBARThez, and mT5. Experiments were performed on the DECODA (Call Center) dialogue corpus, in which the task is to generate abstractive synopses of call center conversations between a caller and one or more agents, depending on the situation. Results show that the BARThez models offer the best performance, well above the previous state of the art on DECODA. We further discuss the limits of such pre-trained models and the challenges that must be addressed for summarizing spontaneous dialogues.
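As a rough illustration of how such pre-trained models can be applied to French abstractive summarization, the sketch below runs a publicly released BARThez summarization checkpoint through Hugging Face Transformers. The checkpoint name (moussaKam/barthez-orangesum-abstract, fine-tuned on OrangeSum rather than DECODA) and the example dialogue are assumptions for illustration only; this is not the paper's own fine-tuning setup.

```python
# Minimal sketch: abstractive summarization of a short French dialogue
# with a BARThez seq2seq checkpoint via Hugging Face Transformers.
# Assumption: the "moussaKam/barthez-orangesum-abstract" checkpoint
# (fine-tuned on OrangeSum news, not on DECODA call-center data).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "moussaKam/barthez-orangesum-abstract"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical call-center exchange, flattened into a single input string.
dialogue = (
    "Agent : Bonjour, que puis-je faire pour vous ? "
    "Appelant : Bonjour, j'ai un souci avec ma carte de transport, "
    "elle ne fonctionne plus depuis hier."
)

inputs = tokenizer(dialogue, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

For dialogue data such as DECODA, speaker turns are typically concatenated into a single sequence before encoding, as done above; long conversations that exceed the model's input limit have to be truncated or segmented.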

