Towards Human-Friendly Referring Expression Generation

11/29/2018
by   Mikihiro Tanaka, et al.

This paper addresses the generation of referring expressions that not only refer to objects correctly but also ease human comprehension. As the composition of an image becomes more complicated and a target becomes relatively less salient, identifying the referred object becomes more difficult. However, existing studies regard all sentences that refer to objects correctly as equally good, ignoring whether they are easily understood by humans. If the target is not salient, humans utilize relationships with the salient contexts around it to help listeners comprehend it better. To derive this information from human annotations, our model is designed to extract information from both the inside and the outside of the target. Moreover, we regard easily understood sentences as those that humans comprehend correctly and quickly. We optimize the model using the time humans required to locate the referred objects and the accuracy with which they did so. To evaluate our system, we created a new referring expression dataset whose images were acquired from Grand Theft Auto V (GTA V), limiting targets to persons. Our proposed method outperformed previous methods on both machine evaluation and crowd-sourced human evaluation. The source code and dataset will be available soon.
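
The abstract does not spell out how locating time and accuracy are combined, so the following is only a minimal sketch of the general idea of scoring an expression by how correctly and how quickly humans resolve it. The `Trial` record, the `comprehension_score` formula (accuracy multiplied by a linear speed term), the `max_seconds` cap, and `rank_expressions` are all hypothetical names and choices introduced for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trial:
    """One human attempt to locate the target from an expression (hypothetical record)."""
    correct: bool    # whether the annotator selected the right person
    seconds: float   # time the annotator took to answer


def comprehension_score(trials, max_seconds=10.0):
    """Illustrative score in [0, 1]: accuracy scaled by the speed of correct answers.

    Assumption: this product form and the max_seconds cap are placeholders,
    not the objective used by the authors.
    """
    if not trials:
        return 0.0
    accuracy = mean(1.0 if t.correct else 0.0 for t in trials)
    # Only correct answers contribute to the speed term; slow answers are capped.
    correct_times = [min(t.seconds, max_seconds) for t in trials if t.correct]
    if not correct_times:
        return 0.0
    speed = 1.0 - mean(correct_times) / max_seconds
    return accuracy * speed


def rank_expressions(trials_by_expression):
    """Return expressions sorted from highest to lowest comprehension score."""
    return sorted(
        trials_by_expression,
        key=lambda expr: comprehension_score(trials_by_expression[expr]),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up trial data, for demonstration only.
    demo = {
        "the man in red next to the taxi": [Trial(True, 2.0), Trial(True, 4.0)],
        "the man": [Trial(False, 8.0), Trial(True, 6.0)],
    }
    for expr in rank_expressions(demo):
        print(expr, round(comprehension_score(demo[expr]), 3))
```

Under this toy score, an expression that mentions a salient context object would rank above a bare description whenever annotators find the target faster and more reliably with it, which is the behavior the abstract argues a generator should be rewarded for.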

research
07/31/2016

Modeling Context in Referring Expressions

Humans refer to objects in their environments all the time, especially i...
research
01/12/2017

Comprehension-guided referring expressions

We consider generation and comprehension of natural language referring e...
research
08/19/2023

Whether you can locate or not? Interactive Referring Expression Generation

Referring Expression Generation (REG) aims to generate unambiguous Refer...
research
05/26/2018

Using Syntax to Ground Referring Expressions in Natural Images

We introduce GroundNet, a neural network for referring expression recogn...
research
06/02/2020

Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge

Conventional referring expression comprehension (REF) assumes people to ...
research
08/03/2019

Searching for Ambiguous Objects in Videos using Relational Referring Expressions

Humans frequently use referring (identifying) expressions to refer to ob...
research
03/09/2018

DeepMoTIon: Learning to Navigate Like Humans

We present a novel human-aware navigation approach, where the robot lear...
