Multi-Modal Open-Domain Dialogue

10/02/2020
by   Kurt Shuster, et al.
0

Recent work in open-domain conversational agents has demonstrated that significant improvements in model engagingness and humanness metrics can be achieved via massive scaling in both pre-training data and model size (Adiwardana et al., 2020; Roller et al., 2020). However, if we want to build agents with human-like abilities, we must expand beyond handling just text. A particularly important topic is the ability to see images and communicate about what is perceived. With the goal of engaging humans in multi-modal dialogue, we investigate combining components from state-of-the-art open-domain dialogue agents with those from state-of-the-art vision models. We study incorporating different image fusion schemes and domain-adaptive pre-training and fine-tuning strategies, and show that our best resulting model outperforms strong existing models in multi-modal dialogue while simultaneously performing as well as its predecessor (text-only) BlenderBot (Roller et al., 2020) in text-based conversation. We additionally investigate and incorporate safety components in our final model, and show that such efforts do not diminish model performance with respect to engagingness metrics.

READ FULL TEXT

page 1

page 10

page 19

page 20

research
04/15/2021

Retrieval Augmentation Reduces Hallucination in Conversation

Despite showing increasingly human-like conversational abilities, state-...
research
05/24/2023

PaCE: Unified Multi-modal Dialogue Pre-training with Progressive and Compositional Experts

Perceiving multi-modal information and fulfilling dialogues with humans ...
research
10/04/2019

Variational Osmosis for Non-linear Image Fusion

We propose a new variational model for nonlinear image fusion. Our appro...
research
08/05/2022

BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage

We present BlenderBot 3, a 175B parameter dialogue model capable of open...
research
08/17/2019

Build it Break it Fix it for Dialogue Safety: Robustness from Adversarial Human Attack

The detection of offensive language in the context of a dialogue has bec...
research
11/09/2019

The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

We introduce dodecaDialogue: a set of 12 tasks that measures if a conver...
research
09/06/2018

Training Millions of Personalized Dialogue Agents

Current dialogue systems are not very engaging for users, especially whe...

Please sign up or login with your details

Forgot password? Click here to reset