On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

02/24/2020
by Xinyu Wang, et al.

Visual Question Answering (VQA) methods have made substantial progress, but they generalize poorly. In particular, they tend to learn coincidental correlations in the data rather than deeper relations between image content and the ideas expressed in language. We present a dataset that takes a step towards addressing this problem: it contains questions expressed in two languages, together with an evaluation process that co-opts a well-understood image-based metric to reflect a method's ability to reason. Measuring reasoning directly encourages generalization by penalizing answers that are coincidentally correct. The dataset reflects the scene-text version of the VQA problem, and the reasoning evaluation can be seen as a text-based version of a referring expression challenge. We provide experiments and analysis that show the value of the dataset.
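To make the idea of penalizing coincidentally correct answers concrete, here is a minimal sketch of an evidence-aware scoring rule. It rests on assumptions that the abstract does not state: that the "well understood image-based metric" is intersection-over-union (IoU) between a predicted and a ground-truth evidence box, that answers are compared by case-insensitive exact match, and that 0.5 is the IoU threshold. The names `iou` and `evidence_aware_correct` are hypothetical and are not taken from the paper.

```python
from typing import Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def evidence_aware_correct(
    pred_answer: str,
    gt_answer: str,
    pred_box: Box,
    gt_box: Box,
    iou_threshold: float = 0.5,  # assumed threshold, not from the source
) -> bool:
    """Count a prediction only if the answer AND its evidence box match.

    Assumption: exact (case-insensitive) string match for the answer.
    """
    answer_ok = pred_answer.strip().lower() == gt_answer.strip().lower()
    evidence_ok = iou(pred_box, gt_box) >= iou_threshold
    return answer_ok and evidence_ok
```

Under this rule, a model that outputs the right string while pointing at an unrelated image region receives no credit, which is the sense in which measuring the evidence discourages answers that are right only by coincidence.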


