Cycle-Consistency for Robust Visual Question Answering

02/15/2019
by Meet Shah, et al.

Despite significant progress in Visual Question Answering over the years, the robustness of today's VQA models leaves much to be desired. We introduce a new evaluation protocol and associated dataset (VQA-Rephrasings) and show that state-of-the-art VQA models are notoriously brittle to linguistic variations in questions. VQA-Rephrasings contains three human-provided rephrasings for 40k questions spanning 40k images from the VQA v2.0 validation dataset. As a step towards improving the robustness of VQA models, we propose a model-agnostic framework that exploits cycle consistency. Specifically, we train a model not only to answer a question, but also to generate a question conditioned on the answer, such that the answer predicted for the generated question is the same as the ground-truth answer to the original question. Without the use of additional annotations, we show that our approach is significantly more robust to linguistic variations than state-of-the-art VQA models when evaluated on the VQA-Rephrasings dataset. In addition, our approach outperforms state-of-the-art approaches on the standard VQA and Visual Question Generation tasks on the challenging VQA v2.0 dataset.
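The abstract describes the cycle-consistency idea only at a high level. The sketch below is a minimal, illustrative PyTorch rendering of that training objective, not the paper's implementation: the module names (VQAModel, QuestionGenerator), the feature-vector stand-in for question text, the placeholder question-generation loss, and the loss weights w_qgen and w_cycle are all assumptions made here for clarity. The actual framework attaches to existing VQA architectures and generates natural-language rephrasings, with additional details (e.g., how generated questions are decoded and filtered) that this sketch omits.

```python
# Minimal sketch of a cycle-consistent VQA training objective.
# All module and parameter names here are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VQAModel(nn.Module):
    """Toy stand-in for any VQA model: (image features, question features) -> answer logits."""
    def __init__(self, img_dim=2048, q_dim=300, hidden=512, num_answers=3129):
        super().__init__()
        self.fuse = nn.Linear(img_dim + q_dim, hidden)
        self.classify = nn.Linear(hidden, num_answers)

    def forward(self, img_feat, q_feat):
        h = torch.relu(self.fuse(torch.cat([img_feat, q_feat], dim=-1)))
        return self.classify(h)


class QuestionGenerator(nn.Module):
    """Toy stand-in for the question-generation module: (image features, answer distribution) -> question features."""
    def __init__(self, img_dim=2048, num_answers=3129, q_dim=300, hidden=512):
        super().__init__()
        self.ans_embed = nn.Linear(num_answers, hidden)
        self.gen = nn.Linear(img_dim + hidden, q_dim)

    def forward(self, img_feat, ans_probs):
        a = torch.relu(self.ans_embed(ans_probs))
        return self.gen(torch.cat([img_feat, a], dim=-1))


def cycle_consistency_loss(vqa, qgen, img_feat, q_feat, ans_label,
                           w_qgen=1.0, w_cycle=0.5):
    # 1) Answer the original question.
    logits = vqa(img_feat, q_feat)
    loss_vqa = F.cross_entropy(logits, ans_label)

    # 2) Generate a rephrased question conditioned on the predicted answer
    #    (represented here only as a question feature vector).
    q_rephrased = qgen(img_feat, F.softmax(logits, dim=-1))
    loss_qgen = F.mse_loss(q_rephrased, q_feat)  # placeholder generation loss

    # 3) Answer the generated question; its prediction should match the
    #    ground-truth answer of the original question (cycle consistency).
    logits_cycle = vqa(img_feat, q_rephrased)
    loss_cycle = F.cross_entropy(logits_cycle, ans_label)

    return loss_vqa + w_qgen * loss_qgen + w_cycle * loss_cycle


# Example usage with random features (batch of 8).
vqa, qgen = VQAModel(), QuestionGenerator()
img = torch.randn(8, 2048)
q = torch.randn(8, 300)
ans = torch.randint(0, 3129, (8,))
loss = cycle_consistency_loss(vqa, qgen, img, q, ans)
loss.backward()
```

The key design point the sketch captures is that the same VQA model is applied twice, once to the original question and once to its generated rephrasing, and both passes are penalized against the same ground-truth answer, which is what makes the objective model-agnostic.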


Related research

07/08/2020
IQ-VQA: Intelligent Visual Question Answering
Even though there has been tremendous progress in the field of Visual Qu...

12/16/2019
Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing
Despite significant success in Visual Question Answering (VQA), VQA mode...

11/26/2020
Learning from Lexical Perturbations for Consistent Visual Question Answering
Existing Visual Question Answering (VQA) models are often fragile and se...

02/17/2020
CQ-VQA: Visual Question Answering on Categorized Questions
This paper proposes CQ-VQA, a novel 2-level hierarchical but end-to-end ...

06/04/2021
Human-Adversarial Visual Question Answering
Performance on the most commonly used Visual Question Answering dataset ...

09/10/2019
Sunny and Dark Outside?! Improving Answer Consistency in VQA through Entailed Question Generation
While models for Visual Question Answering (VQA) have steadily improved ...

04/18/2019
Towards VQA Models that can Read
Studies have shown that a dominant class of questions asked by visually ...
