CoDraw: Visual Dialog for Collaborative Drawing

12/15/2017
by   Jin-Hwa Kim, et al.

In this work, we propose a goal-driven collaborative task whose core components are vision, language, and action in a virtual environment. Specifically, we develop CoDraw, a collaborative "Image Drawing" game between two agents. The game is grounded in a virtual world that contains movable clip art objects. Two players are involved, a Teller and a Drawer. The Teller sees an abstract scene containing multiple clip art objects in a semantically meaningful configuration, while the Drawer tries to reconstruct the scene on an empty canvas using the available clip art. The two players communicate in both directions using natural language. We collect the CoDraw dataset of 10K dialogs, consisting of 138K messages exchanged between a Teller and a Drawer, from Amazon Mechanical Turk (AMT). We analyze the dataset and present three models of the players' behaviors, including an attention model that describes and draws multiple clip art objects at each round. The attention models are quantitatively compared with the other models to show how conventional approaches perform on this new task. We also present qualitative visualizations.
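
The turn-taking structure of the game described above can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `ClipArt`, `ScriptedTeller`, `ScriptedDrawer`, and `play` are hypothetical, the templated "place ... at ..." messages stand in for the free-form natural language used by human players, and clip art names are assumed to be single words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipArt:
    """Hypothetical clip art object: a name and a canvas position."""
    name: str
    x: int
    y: int


class ScriptedTeller:
    """Toy Teller: sees the target scene and describes one object per round."""

    def __init__(self, scene):
        self.remaining = list(scene)

    def speak(self, drawer_reply):
        # This toy Teller ignores the Drawer's reply; a learned Teller
        # could condition its next message on it.
        if not self.remaining:
            return None  # nothing left to describe, so the dialog ends
        art = self.remaining.pop(0)
        return f"place {art.name} at {art.x} {art.y}"


class ScriptedDrawer:
    """Toy Drawer: starts from an empty canvas and follows each message."""

    def __init__(self):
        self.canvas = []

    def act(self, message):
        # Parse the templated message "place <name> at <x> <y>".
        _, name, _, x, y = message.split()
        self.canvas.append(ClipArt(name, int(x), int(y)))
        return "ok"


def play(scene):
    """Run one two-way dialog and return the Drawer's canvas and the messages."""
    teller, drawer = ScriptedTeller(scene), ScriptedDrawer()
    dialog, reply = [], ""
    while (message := teller.speak(reply)) is not None:
        reply = drawer.act(message)
        dialog += [("teller", message), ("drawer", reply)]
    return drawer.canvas, dialog


if __name__ == "__main__":
    target = [ClipArt("sun", 50, 20), ClipArt("tree", 300, 180)]
    canvas, dialog = play(target)
    for speaker, text in dialog:
        print(f"{speaker}: {text}")
    print("reconstructed:", canvas == target)
```

In this sketch the Drawer reconstructs the scene exactly only because the messages are unambiguous templates; with free-form language, the reconstruction would generally be approximate and would need a scene similarity measure to evaluate.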

research
12/01/2021

Iconary: A Pictionary-Based Game for Testing Multimodal Communication with Drawings and Text

Communicating with humans is challenging for AIs because it requires a s...
research
07/17/2020

iNNk: A Multi-Player Game to Deceive a Neural Network

This paper presents iNNK, a multiplayer drawing game where human players...
research
03/16/2022

Spot the Difference: A Cooperative Object-Referring Game in Non-Perfectly Co-Observable Scene

Visual dialog has witnessed great progress after introducing various vis...
research
06/03/2021

Learning to Draw: Emergent Communication through Sketching

Evidence that visual communication preceded written language and provide...
research
06/27/2021

Draw Me a Flower: Grounding Formal Abstract Structures Stated in Informal Natural Language

Forming and interpreting abstraction is a core process in human communic...
research
08/09/2017

Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

Cinemagraphs are a compelling way to convey dynamic aspects of a scene. ...
research
04/23/2017

Translating Neuralese

Several approaches have recently been proposed for learning decentralize...
