Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models

03/26/2020
by   Pranav Agarwal, et al.

Image captioning models can now generate grammatically correct, human-understandable sentences. However, most captions convey limited information because the underlying models are trained on datasets that do not cover all objects encountered in everyday life. Due to this lack of prior information, most captions are biased toward only a few objects present in the scene, limiting their usefulness in daily life. In this paper, we demonstrate the biased nature of existing image captioning models and present a new image captioning dataset, Egoshots, consisting of 978 real-life images with no captions. We further exploit state-of-the-art pre-trained image captioning and object recognition networks to annotate our images and show the limitations of existing work. Furthermore, to evaluate the quality of the generated captions, we propose a new image captioning metric, object-based Semantic Fidelity (SF). Existing image captioning metrics can evaluate a caption only in the presence of its corresponding annotations; SF, in contrast, allows evaluating captions generated for images without annotations, making it highly useful for captions generated on real-life images.
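The core idea behind an object-based, annotation-free metric can be sketched as follows: compare the object labels produced by a pre-trained object recognition network against the words of the generated caption. The snippet below is a minimal illustrative sketch only, assuming SF reduces to the fraction of detected objects mentioned in the caption; the paper's exact SF definition may weight or match terms differently.

```python
def semantic_fidelity(caption: str, detected_objects: list[str]) -> float:
    """Toy object-based fidelity score (illustrative, not the paper's
    exact formula): the fraction of detected object labels that appear
    as words in the generated caption."""
    if not detected_objects:
        return 0.0
    caption_words = set(caption.lower().split())
    hits = sum(1 for obj in detected_objects if obj.lower() in caption_words)
    return hits / len(detected_objects)


# Example: a detector finds three objects, the caption mentions two of them.
score = semantic_fidelity("a dog lying on the grass", ["dog", "grass", "ball"])
```

Because the reference signal comes from a detector rather than human-written ground-truth captions, such a score can be computed for any image, which is what makes the metric applicable to unannotated, real-life photos.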

