Conformal prediction under ambiguous ground truth

07/18/2023
by   David Stutz, et al.

In safety-critical classification tasks, conformal prediction enables rigorous uncertainty quantification by providing confidence sets that include the true class with a user-specified probability. This generally assumes the availability of a held-out calibration set with access to ground truth labels. Unfortunately, in many domains, such labels are difficult to obtain and are usually approximated by aggregating expert opinions. In fact, this holds true for almost all datasets, including well-known ones such as CIFAR and ImageNet. Applying conformal prediction using such labels underestimates uncertainty. Indeed, when expert opinions are not resolvable, there is inherent ambiguity present in the labels. That is, we do not have “crisp”, definitive ground truth labels, and this uncertainty should be taken into account during calibration. In this paper, we develop a conformal prediction framework for such ambiguous ground truth settings, which relies on an approximation of the underlying posterior distribution of labels given inputs. We demonstrate our methodology on synthetic and real datasets, including a case study of skin condition classification in dermatology.
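
To make the calibration step concrete, the following is a minimal sketch of standard split conformal prediction, followed by one possible way to bring label ambiguity into calibration by sampling calibration labels from an approximate posterior. The function names (`calibrate_threshold`, `prediction_set`, `sample_labels`), the conformity score (the predicted probability of the label), and the sampling scheme are illustrative assumptions and are not taken from the abstract; the paper's actual procedure may differ.

```python
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Split conformal calibration (illustrative sketch).

    cal_probs:  (n, K) predicted class probabilities on the calibration set.
    cal_labels: (n,) integer calibration labels.
    alpha:      target miscoverage level, so coverage is at least 1 - alpha.
    """
    n = len(cal_labels)
    # Conformity score: probability the model assigns to the calibration label.
    scores = cal_probs[np.arange(n), cal_labels]
    # Finite-sample correction: take the floor(alpha * (n + 1))-th smallest score.
    k = int(np.floor(alpha * (n + 1)))
    if k < 1:
        return 0.0  # too few calibration points: every class is included
    return float(np.sort(scores)[k - 1])


def prediction_set(probs, tau):
    """Return the indices of all classes whose probability is at least tau."""
    return np.flatnonzero(probs >= tau)


def sample_labels(plausibilities, rng):
    """Assumed scheme: draw one label per example from an approximate
    posterior over labels (each row of `plausibilities` sums to 1),
    instead of using a single aggregated label."""
    num_classes = plausibilities.shape[1]
    return np.array([rng.choice(num_classes, p=p) for p in plausibilities])
```

With crisp labels, `calibrate_threshold` is called on the given labels directly. In the ambiguous setting sketched here, the labels passed to it would first be drawn with `sample_labels` from per-example label distributions, for instance normalized expert vote counts, so that disagreement among experts influences the threshold.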

research
07/05/2023

Evaluating AI systems under uncertain ground truth: a case study in dermatology

For safety, AI systems in health undergo thorough evaluations before dep...
research
08/20/2023

Unsupervised Opinion Aggregation – A Statistical Perspective

Complex decision-making systems rarely have direct access to the current...
research
07/21/2022

A Forgotten Danger in DNN Supervision Testing: Generating and Detecting True Ambiguity

Deep Neural Networks (DNNs) are becoming a crucial component of modern s...
research
04/12/2021

Spatially Varying Label Smoothing: Capturing Uncertainty from Expert Annotations

The task of image segmentation is inherently noisy due to ambiguities re...
research
06/26/2019

Near Optimal Stratified Sampling

The performance of a machine learning system is usually evaluated by usi...
research
10/28/2020

Independence Tests Without Ground Truth for Noisy Learners

Exact ground truth invariant polynomial systems can be written for arbit...
research
04/10/2021

Deep Weakly Supervised Positioning

PoseNet can map a photo to the position where it is taken, which is appe...
