Image-to-Text Translation for Interactive Image Recognition: A Comparative User Study with Non-Expert Users

05/11/2023
by   Wataru Kawabe, et al.
0

Interactive machine learning (IML) allows users to build their custom machine learning models without expert knowledge. While most existing IML systems are designed with classification algorithms, they sometimes oversimplify the capabilities of machine learning algorithms and restrict the user's task definition. On the other hand, as recent large-scale language models have shown, natural language representation has the potential to enable more flexible and generic task descriptions. Models that take images as input and output text have the potential to represent a variety of tasks by providing appropriate text labels for training. However, the effect of introducing text labels to IML system design has never been investigated. In this work, we aim to investigate the difference between image-to-text translation and image classification for IML systems. Using our prototype systems, we conducted a comparative user study with non-expert users, where participants solved various tasks. Our results demonstrate the underlying difficulty for users in properly defining image recognition tasks while highlighting the potential and challenges of interactive image-to-text translation systems.

READ FULL TEXT

page 2

page 4

page 5

page 9

research
03/02/2023

Interactive Text Generation

Users interact with text, image, code, or other editors on a daily basis...
research
04/18/2023

Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models

Text-to-image generative models have demonstrated remarkable capabilitie...
research
03/18/2022

But that's not why: Inference adjustment by interactive prototype deselection

Despite significant advances in machine learning, decision-making of art...
research
07/18/2023

PromptMagician: Interactive Prompt Engineering for Text-to-Image Creation

Generative text-to-image models have gained great popularity among the p...
research
10/20/2022

ObSynth: An Interactive Synthesis System for Generating Object Models from Natural Language Specifications

We introduce ObSynth, an interactive system leveraging the domain knowle...
research
04/08/2021

Classification, Slippage, Failure and Discovery

This text argues for the potential of machine learning infused classific...
research
02/08/2021

RECAST: Enabling User Recourse and Interpretability of Toxicity Detection Models with Interactive Visualization

With the widespread use of toxic language online, platforms are increasi...

Please sign up or login with your details

Forgot password? Click here to reset