A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models

09/06/2023
by Noriyuki Kojima, et al.

Key to tasks that require reasoning about natural language in visual contexts is grounding words and phrases to image regions. However, observing this grounding in contemporary models is complex, even if it is generally expected to take place if the task is addressed in a way that is conducive to generalization. We propose a framework to jointly study task performance and phrase grounding, and propose three benchmarks to study the relation between the two. Our results show that contemporary models demonstrate inconsistency between their ability to ground phrases and solve tasks. We show how this can be addressed through brute-force training on phrase grounding annotations, and analyze the dynamics it creates. Code is available at https://github.com/lil-lab/phrase_grounding.
