Fast and Scalable Image Search For Histology

by Chengkuan Chen, et al.

The expanding adoption of digital pathology has enabled the curation of large repositories of histology whole slide images (WSIs), which contain a wealth of information. Searching for similar pathology images offers the opportunity to comb through large historical repositories of gigapixel WSIs to identify cases with similar morphological features. It can be particularly useful for diagnosing rare diseases and for identifying similar cases to predict prognosis, treatment outcomes, and potential clinical trial success. A critical challenge in developing a WSI search and retrieval system is scalability: the system must search a growing number of slides, each of which can consist of billions of pixels and occupy several gigabytes. Existing systems are typically slow, and their retrieval time often grows with the size of the repository they search, which makes clinical adoption tedious and makes them infeasible for repositories that are constantly growing. Here we present Fast Image Search for Histopathology (FISH), a histology image search pipeline that is infinitely scalable and achieves constant search speed that is independent of the image database size, while being interpretable and without requiring detailed annotations. FISH uses self-supervised deep learning to encode meaningful representations from WSIs and a Van Emde Boas tree for fast search, followed by an uncertainty-based ranking algorithm to retrieve similar WSIs. We evaluated FISH on multiple tasks and datasets with over 22,000 patient cases spanning 56 disease subtypes. We additionally demonstrate that FISH can be used to assist with the diagnosis of rare cancer types, where sufficient cases may not be available to train traditional supervised deep models. FISH is available as an easy-to-use, open-source software package (
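To make the role of the Van Emde Boas (vEB) tree more concrete, here is a minimal Python sketch of the data structure. It is not the FISH implementation: the abstract does not say how image representations are turned into integer keys, so the 16-bit key universe, the `VEB` class, and the `nearest` helper below are all illustrative assumptions. What the sketch does show is the property the abstract appears to rely on: insert, successor, and predecessor take O(log log U) steps, where U is the size of the key universe, so the cost does not depend on how many keys are stored.

```python
class VEB:
    """Minimal van Emde Boas tree over integer keys in [0, 2**bits)."""

    def __init__(self, bits):
        self.bits = bits
        self.min = None          # smallest key; NOT stored in any cluster
        self.max = None          # largest key
        if bits > 1:
            self.lo_bits = bits // 2
            self.hi_bits = bits - self.lo_bits
            self.clusters = {}   # high part -> VEB(lo_bits), created lazily
            self.summary = None  # VEB(hi_bits) listing non-empty clusters

    def _split(self, x):
        return x >> self.lo_bits, x & ((1 << self.lo_bits) - 1)

    def _join(self, hi, lo):
        return (hi << self.lo_bits) | lo

    def insert(self, x):
        if self.min is None:
            self.min = self.max = x
            return
        if x == self.min or x == self.max:
            return                      # already present
        if x < self.min:
            x, self.min = self.min, x   # new min; push the old min down
        if x > self.max:
            self.max = x
        if self.bits > 1:
            hi, lo = self._split(x)
            if hi not in self.clusters:
                self.clusters[hi] = VEB(self.lo_bits)
                if self.summary is None:
                    self.summary = VEB(self.hi_bits)
                self.summary.insert(hi)
            self.clusters[hi].insert(lo)

    def __contains__(self, x):
        if x == self.min or x == self.max:
            return True
        if self.bits == 1 or self.min is None:
            return False
        hi, lo = self._split(x)
        cluster = self.clusters.get(hi)
        return cluster is not None and lo in cluster

    def successor(self, x):
        """Smallest stored key strictly greater than x, or None."""
        if self.min is not None and x < self.min:
            return self.min
        if self.max is None or x >= self.max:
            return None
        if self.bits == 1:
            return self.max
        hi, lo = self._split(x)
        cluster = self.clusters.get(hi)
        if cluster is not None and lo < cluster.max:
            return self._join(hi, cluster.successor(lo))
        nxt = self.summary.successor(hi)   # exists because x < self.max
        return self._join(nxt, self.clusters[nxt].min)

    def predecessor(self, x):
        """Largest stored key strictly smaller than x, or None."""
        if self.max is not None and x > self.max:
            return self.max
        if self.min is None or x <= self.min:
            return None
        if self.bits == 1:
            return self.min
        hi, lo = self._split(x)
        cluster = self.clusters.get(hi)
        if cluster is not None and lo > cluster.min:
            return self._join(hi, cluster.predecessor(lo))
        prev = self.summary.predecessor(hi) if self.summary is not None else None
        if prev is None:
            return self.min                # only the min lies below x
        return self._join(prev, self.clusters[prev].max)


def nearest(tree, query):
    """Hypothetical helper: stored key closest to `query` (ties go low)."""
    if query in tree:
        return query
    below, above = tree.predecessor(query), tree.successor(query)
    candidates = [k for k in (below, above) if k is not None]
    return min(candidates, key=lambda k: abs(k - query)) if candidates else None


# Example: index a few made-up integer keys, then look up a query key.
index = VEB(bits=16)
for key in (3, 17, 42, 200, 65535):
    index.insert(key)
closest = nearest(index, 50)
```

In this sketch a query that is not in the index is answered by comparing its predecessor and successor, which is one plausible way to retrieve neighbouring entries by key. How FISH actually derives keys and ranks the candidates is described in the paper itself, not in this abstract.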




Similar Image Search for Histopathology: SMILY

The increasing availability of large institutional and public histopatho...

Self-supervised similarity search for large scientific datasets

We present the use of self-supervised learning to explore and exploit la...

Detection of prostate cancer in whole-slide images through end-to-end training with image-level labels

Prostate cancer is the most prevalent cancer among men in Western countr...

Self-Supervised Representation Learning using Visual Field Expansion on Digital Pathology

The examination of histopathology images is considered to be the gold st...

Scalable Reverse Image Search Engine for NASAWorldview

Researchers often spend weeks sifting through decades of unlabeled satel...

Histopathology Slide Indexing and Search: Are We There Yet?

The search and retrieval of digital histopathology slides is an importan...
