A General-Purpose Self-Supervised Model for Computational Pathology

by Richard J. Chen et al.

Tissue phenotyping is a fundamental computational pathology (CPath) task in learning objective characterizations of histopathologic biomarkers in anatomic pathology. However, whole-slide imaging (WSI) poses a complex computer vision problem in which the large-scale image resolutions of WSIs and the enormous diversity of morphological phenotypes preclude large-scale data annotation. Current efforts have proposed using pretrained image encoders with either transfer learning from natural image datasets or self-supervised pretraining on publicly available histopathology datasets, but these have not been extensively developed and evaluated across diverse tissue types at scale. We introduce UNI, a general-purpose self-supervised model for pathology, pretrained using over 100 million tissue patches from over 100,000 diagnostic haematoxylin and eosin-stained WSIs across 20 major tissue types, and evaluated on 33 representative clinical tasks in CPath of varying diagnostic difficulty. In addition to outperforming previous state-of-the-art models, we demonstrate new modeling capabilities in CPath, such as resolution-agnostic tissue classification, slide classification using few-shot class prototypes, and disease-subtyping generalization in classifying up to 108 cancer types in the OncoTree code classification system. UNI advances unsupervised representation learning at scale in CPath in terms of both pretraining data and downstream evaluation, enabling data-efficient AI models that can generalize and transfer to a gamut of diagnostically challenging tasks and clinical workflows in anatomic pathology.


