UNITER-Based Situated Coreference Resolution with Rich Multimodal Input

12/07/2021
by Yichen Huang, et al.

We present our work on the multimodal coreference resolution task of the Situated and Interactive Multimodal Conversation 2.0 (SIMMC 2.0) dataset as part of the tenth Dialog System Technology Challenge (DSTC10). We propose a UNITER-based model that uses rich multimodal context, including the textual dialog history, the object knowledge base, and the visual dialog scenes, to determine whether each object in the current scene is mentioned in the current dialog turn. Results show that the proposed approach substantially outperforms the official DSTC10 baseline, with the object F1 score boosted from 36.6 [...] the development set, demonstrating the effectiveness of the proposed object representations from rich multimodal input. Our model ranks second in the official evaluation on the object coreference resolution task with an F1 score of 73.3.
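The abstract frames the task as deciding, for each object in the scene, whether the current turn mentions it, and reports results as an object F1 score. As a rough illustration of how such a score can be computed, here is a minimal sketch. It assumes micro-averaging over dialog turns (counts pooled across turns before computing precision and recall); the function name `object_f1` and the toy object IDs are hypothetical, and this is not the official DSTC10 evaluation script.

```python
from typing import Iterable, Set, Tuple


def object_f1(turns: Iterable[Tuple[Set[int], Set[int]]]) -> float:
    """Micro-averaged F1 over (predicted_ids, gold_ids) pairs, one per turn.

    Assumption: counts are pooled across all turns before precision and
    recall are computed. This is an illustrative choice, not a claim about
    the official scorer.
    """
    n_correct = n_pred = n_gold = 0
    for predicted, gold in turns:
        n_correct += len(predicted & gold)  # objects both predicted and gold
        n_pred += len(predicted)            # all predicted objects
        n_gold += len(gold)                 # all gold-mentioned objects
    if n_correct == 0:
        # Covers empty input and the no-overlap case; avoids division by zero.
        return 0.0
    precision = n_correct / n_pred
    recall = n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


# Toy data: two turns, each a (predicted object IDs, gold object IDs) pair.
example_turns = [({1, 2}, {1}), ({3}, {3, 4})]
score = object_f1(example_turns)
```

In the toy data, two of three predicted objects are correct and two of three gold objects are recovered, so precision and recall are both 2/3.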

