Towards Reasoning-Aware Explainable VQA

11/09/2022
by Rakesh Vaideeswaran et al.

The domain of joint vision-language understanding, especially reasoning in Visual Question Answering (VQA) models, has attracted significant attention recently. While most existing VQA models focus on improving accuracy, the way a model arrives at an answer is often a black box. As a step towards making the VQA task more explainable and interpretable, our method builds on the SOTA VQA framework by augmenting it with an end-to-end explanation generation module. In this paper, we investigate two network architectures as the explanation generator: a Long Short-Term Memory (LSTM) and a Transformer decoder. Our method generates human-readable textual explanations while maintaining SOTA VQA accuracy on the GQA-REX dataset.
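The abstract only names the building blocks, so the PyTorch sketch below is a rough interpretation rather than the authors' code: fused vision-language features feed both an answer classifier and a Transformer-decoder explanation head that are trained end to end. All module names and sizes (ExplainableVQA, joint_feats, num_answers, vocab_size) are placeholder assumptions of ours.

import torch
import torch.nn as nn

class ExplainableVQA(nn.Module):
    """Hypothetical two-head model: answer classification plus explanation generation."""

    def __init__(self, hidden=512, num_answers=1842, vocab_size=10000):
        super().__init__()
        # Stand-in for the fused vision-language features that a pretrained
        # SOTA VQA backbone would produce.
        self.fusion = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.answer_head = nn.Linear(hidden, num_answers)
        # Explanation generator: a Transformer decoder that attends to the
        # fused features while emitting the explanation token by token.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model=hidden, nhead=8, batch_first=True),
            num_layers=2,
        )
        self.word_head = nn.Linear(hidden, vocab_size)

    def forward(self, joint_feats, expl_tokens):
        # joint_feats: (B, N, hidden) fused image+question features
        # expl_tokens: (B, T) explanation tokens used for teacher forcing
        memory = self.fusion(joint_feats)
        answer_logits = self.answer_head(memory.mean(dim=1))
        T = expl_tokens.size(1)
        # Additive causal mask so each position only attends to earlier tokens.
        causal_mask = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
        dec = self.decoder(self.embed(expl_tokens), memory, tgt_mask=causal_mask)
        return answer_logits, self.word_head(dec)

In such a setup the two heads would typically be optimized jointly, e.g. with a weighted sum of answer and explanation cross-entropy losses; the LSTM variant mentioned in the paper would simply replace the Transformer decoder with a recurrent decoder over the same fused features.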


