Length-Controllable Image Captioning

07/19/2020
by   Chaorui Deng, et al.
The University of Adelaide
South China University of Technology International Student Union

The last decade has witnessed remarkable progress in the image captioning task; however, most existing methods offer no control over the captions they generate, e.g., choosing to describe the image either roughly or in detail. In this paper, we propose to use a simple length level embedding to endow them with this ability. Moreover, due to their autoregressive nature, the computational complexity of existing models increases linearly as the length of the generated captions grows. Thus, we further devise a non-autoregressive image captioning approach that can generate captions with length-irrelevant complexity. We verify the merit of the proposed length level embedding on three models: two state-of-the-art (SOTA) autoregressive models with different types of decoder, as well as our proposed non-autoregressive model, to show its generalization ability. In the experiments, our length-controllable image captioning models not only achieve SOTA performance on the challenging MS COCO dataset but also generate length-controllable and diverse image captions. Specifically, our non-autoregressive model outperforms the autoregressive baselines in terms of controllability and diversity, and also significantly improves the decoding efficiency for long captions. Our code and models are released at https://github.com/bearcatt/LaBERT.
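The abstract does not spell out how the length level embedding is computed, so the following is only a minimal sketch of one plausible reading: a desired caption length is mapped to a discrete level, and a vector for that level is added to every token's word vector. The level ranges, the embedding size, the toy vocabulary, and the names `length_to_level`, `make_table`, and `embed_tokens` are all assumptions made for illustration, and the random tables stand in for learned weights.

```python
import random

# Hypothetical inclusive length ranges, one per level (an assumption, not from the abstract).
LEVEL_RANGES = [(1, 9), (10, 14), (15, 19), (20, 25)]
EMBED_DIM = 8  # illustrative embedding size


def length_to_level(length):
    """Return the index of the level whose range contains `length`."""
    for level, (low, high) in enumerate(LEVEL_RANGES):
        if low <= length <= high:
            return level
    raise ValueError(f"length {length} is outside all level ranges")


def make_table(rows, dim, seed):
    """Build a randomly initialised table, standing in for learned embeddings."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


# Toy vocabulary and tables, used only for this illustration.
VOCAB = {"a": 0, "dog": 1, "runs": 2}
word_table = make_table(len(VOCAB), EMBED_DIM, seed=0)
level_table = make_table(len(LEVEL_RANGES), EMBED_DIM, seed=1)


def embed_tokens(tokens, level):
    """Add the same length level vector to each token's word vector."""
    level_vec = level_table[level]
    return [
        [w + l for w, l in zip(word_table[VOCAB[tok]], level_vec)]
        for tok in tokens
    ]


# Usage: request a caption of about 12 tokens, then embed a partial caption.
level = length_to_level(12)
inputs = embed_tokens(["a", "dog", "runs"], level)
```

Because the level vector is shared by all positions, changing only `level` shifts every input embedding in the same way, which is what would let a decoder condition on the requested length without any other architectural change.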


05/29/2020

Controlling Length in Image Captioning

We develop and evaluate captioning models that allow control of caption ...
12/04/2022

Controllable Image Captioning via Prompting

Despite the remarkable progress of image captioning, existing captioners...
11/27/2022

CLID: Controlled-Length Image Descriptions with Limited Data

Controllable image captioning models generate human-like image descripti...
03/22/2021

Human-like Controllable Image Captioning with Verb-specific Semantic Roles

Controllable Image Captioning (CIC) – generating image descriptions foll...
04/28/2022

Controllable Image Captioning

State-of-the-art image captioners can generate accurate sentences to des...
11/27/2019

Non-Autoregressive Video Captioning with Iterative Refinement

Existing state-of-the-art autoregressive video captioning methods (ARVC)...
04/15/2018

Pragmatically Informative Image Captioning with Character-Level Reference

We combine a neural image captioner with a Rational Speech Acts (RSA) mo...

Code Repositories

LaBERT

A length-controllable and non-autoregressive image captioning model.


