I2Edit: Towards Multi-turn Interactive Image Editing via Dialogue

by Xing Cui, et al.

Although there has been considerable research effort on controllable facial image editing, the desirable interactive setting, where users can interact with the system to adjust their requirements dynamically, has not been well explored. This paper focuses on facial image editing via dialogue and introduces a new benchmark dataset, Multi-turn Interactive Image Editing (I2Edit), for evaluating image editing quality and interaction ability in real-world interactive facial editing scenarios. The dataset is built on the CelebA-HQ dataset, with images annotated with multi-turn dialogues that correspond to the user editing requirements. I2Edit is challenging, as a system needs to 1) track the dynamically updated user requirements and edit the images accordingly, and 2) generate an appropriate natural language response to communicate with the user. To address these challenges, we propose a framework consisting of a dialogue module and an image editing module. The former tracks the user edit requirements and generates the corresponding indicative responses, while the latter edits the images conditioned on the tracked user edit requirements. In contrast to previous works that simply treat multi-turn interaction as a sequence of single-turn interactions, we extract the user edit requirements from the whole dialogue history instead of only the current turn. The extracted global user edit requirements enable us to edit the raw input image directly, which avoids error accumulation and attribute forgetting. Extensive quantitative and qualitative experiments on the I2Edit dataset demonstrate the advantage of our proposed framework over previous single-turn methods. We believe our new dataset could serve as a valuable resource to push forward the exploration of real-world, complex interactive image editing. Code and data will be made public.
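The idea of tracking requirements over the whole dialogue history and then editing the raw image once can be illustrated with a minimal sketch. This is not the paper's implementation: the attribute names, the integer "strength" values, and the functions `track_requirements` and `edit_from_raw` are all hypothetical, and a plain dictionary stands in for both the dialogue state and the image.

```python
from typing import Dict, List

# Hypothetical representation: each parsed dialogue turn maps an
# attribute name to the target strength the user asked for.
Turn = Dict[str, int]


def track_requirements(history: List[Turn]) -> Dict[str, int]:
    """Merge every turn into one global requirement state.

    Later turns override earlier ones for the same attribute, so the
    state always reflects the user's most recent wish per attribute.
    """
    state: Dict[str, int] = {}
    for turn in history:
        state.update(turn)
    return state


def edit_from_raw(raw_attributes: Dict[str, int],
                  requirements: Dict[str, int]) -> Dict[str, int]:
    """Apply the global requirements to the raw input in a single step.

    A dictionary of attribute values stands in for the image here; a
    real system would call an image editing model instead.
    """
    edited = dict(raw_attributes)  # leave the raw input untouched
    edited.update(requirements)
    return edited


# Assumed three-turn dialogue: the user revises "smile" in turn three.
history = [{"smile": 3}, {"bangs": 2}, {"smile": 1}]
raw = {"smile": 0, "bangs": 0, "eyeglasses": 0}

state = track_requirements(history)
result = edit_from_raw(raw, state)
```

Because each turn re-edits the raw input with the full requirement state, an attribute requested in an early turn (here `bangs`) is still present after later turns, and no intermediate edited image is fed back into the editor.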




Sequential Attention GAN for Interactive Image Editing via Dialogue

In this paper, we introduce a new task - interactive image editing via c...

Iterative Interaction Training for Segmentation Editing Networks

Automatic segmentation has great potential to facilitate morphological m...

Talk-to-Edit: Fine-Grained Facial Editing via Dialog

Facial editing is an important task in vision and graphics with numerous...

Adjusting Image Attributes of Localized Regions with Low-level Dialogue

Natural Language Image Editing (NLIE) aims to use natural language instr...

Draft, Command, and Edit: Controllable Text Editing in E-Commerce

Product description generation is a challenging and under-explored task....

Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing

Developers often dedicate significant time to maintaining and refactorin...

Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields

With the popularity of implicit neural representations, or neural radian...
