Adaptive Testing of Computer Vision Models

12/06/2022
by Irena Gao, et al.

Vision models often fail systematically on groups of data that share common semantic characteristics (e.g., rare objects or unusual scenes), but identifying these failure modes is a challenge. We introduce AdaVision, an interactive process for testing vision models which helps users identify and fix coherent failure modes. Given a natural language description of a coherent group, AdaVision retrieves relevant images from LAION-5B with CLIP. The user then labels a small amount of data for model correctness, which is used in successive retrieval rounds to hill-climb towards high-error regions, refining the group definition. Once a group is saturated, AdaVision uses GPT-3 to suggest new group descriptions for the user to explore. We demonstrate the usefulness and generality of AdaVision in user studies, where users find major bugs in state-of-the-art classification, object detection, and image captioning models. These user-discovered groups have failure rates 2-3x higher than those surfaced by automatic error clustering methods. Finally, finetuning on examples found with AdaVision fixes the discovered bugs when evaluated on unseen examples, without degrading in-distribution accuracy, and while also improving performance on out-of-distribution datasets.
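The retrieve, label, and refine loop described in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the AdaVision implementation: random unit vectors stand in for CLIP embeddings of a LAION-5B pool, the hypothetical `is_failure` callback stands in for the user's correctness labels, and the update rule (mixing the query with the centroid of labeled failures, controlled by an assumed `weight` parameter) is one simple way to hill-climb toward high-error regions. The names `retrieve` and `adaptive_rounds` are likewise illustrative.

```python
import numpy as np


def normalize(v):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def retrieve(query, pool, k, seen):
    """Return indices of the k unseen pool items most similar to the query."""
    scores = np.where(seen, -np.inf, pool @ query)
    return np.argsort(-scores)[:k]


def adaptive_rounds(text_emb, pool, is_failure, rounds=3, k=10, weight=0.5):
    """Run several retrieval rounds, steering the query toward labeled failures.

    Assumption: the query is refined by mixing it with the centroid of the
    failures found in the current round. The paper's exact rule may differ.
    """
    query = normalize(text_emb)
    seen = np.zeros(len(pool), dtype=bool)
    failure_rates = []
    for _ in range(rounds):
        idx = retrieve(query, pool, k, seen)
        seen[idx] = True
        # Stand-in for the user labeling each retrieved image for correctness.
        labels = np.array([bool(is_failure(i)) for i in idx])
        failure_rates.append(float(labels.mean()))
        if labels.any():
            centroid = normalize(pool[idx[labels]].mean(axis=0))
            query = normalize((1 - weight) * query + weight * centroid)
    return failure_rates, seen


# Synthetic stand-ins: a pool of "image embeddings" and a hidden failure direction.
rng = np.random.default_rng(0)
pool = normalize(rng.normal(size=(500, 16)))
hidden = normalize(rng.normal(size=16))
# The initial text query only roughly points at the failure region.
text_emb = normalize(hidden + rng.normal(size=16))


def is_failure(i):
    """Hypothetical oracle: items close to the hidden direction count as failures."""
    return pool[i] @ hidden > 0.2


rates, seen = adaptive_rounds(text_emb, pool, is_failure, rounds=3, k=10)
print("failure rate per round:", rates)
```

In a real setting the pool embeddings and the text query would come from the same CLIP model, and the per-round failure rate gives a rough signal for when a group is saturated and a new description is worth exploring.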

research
10/11/2022

SEAL : Interactive Tool for Systematic Error Analysis and Labeling

With the advent of Transformers, large language models (LLMs) have satur...
research
02/08/2022

Causal Scene BERT: Improving object detection by searching for challenging groups of data

Modern computer vision applications rely on learning-based perception mo...
research
10/29/2020

Understanding the Failure Modes of Out-of-Distribution Generalization

Empirical studies suggest that machine learning models often rely on fea...
research
09/05/2023

NICE 2023 Zero-shot Image Captioning Challenge

In this report, we introduce NICE project[<https://nice.lgresearch.ai/>]...
research
11/06/2018

Semantic bottleneck for computer vision tasks

This paper introduces a novel method for the representation of images th...
research
05/08/2023

Distribution-aware Fairness Test Generation

This work addresses how to validate group fairness in image recognition ...
research
10/25/2021

Identifying and Benchmarking Natural Out-of-Context Prediction Problems

Deep learning systems frequently fail at out-of-context (OOC) prediction...
