Surgical Aggregation: A Federated Learning Framework for Harmonizing Distributed Datasets with Diverse Tasks

01/17/2023
by Pranav Kulkarni, et al.

AI-assisted characterization of chest x-rays (CXR) has the potential to provide substantial benefits across many clinical applications. Many large-scale public CXR datasets have been curated for detection of abnormalities using deep learning. However, each of these datasets focuses on detecting only a subset of the disease labels that could be present in a CXR, which limits their clinical utility. Furthermore, the distributed nature of these datasets, along with data sharing regulations, makes it difficult to pool them into a complete representation of disease labels. We propose surgical aggregation, a federated learning framework for aggregating knowledge from distributed datasets with different disease labels into a 'global' deep learning model. We randomly divided the NIH Chest X-Ray 14 dataset into training (70 and conducted two experiments. In the first experiment, we pruned the disease labels to create two 'toy' datasets containing 11 and 8 labels respectively, with 4 overlapping labels. In the second experiment, we pruned the disease labels to create two disjoint 'toy' datasets with 7 labels each. In both experiments, the surgically aggregated 'global' model performed on par with a 'baseline' model trained on the complete set of disease labels: the overlapping and disjoint experiments had AUROCs of 0.87 and 0.86 respectively, compared to the baseline AUROC of 0.87. We also used surgical aggregation to harmonize the NIH Chest X-Ray 14 and CheXpert datasets into a 'global' model with AUROCs of 0.85 and 0.83 respectively. Our results show that surgical aggregation could be used to develop clinically useful deep learning models by aggregating knowledge from distributed datasets with diverse tasks, a step towards bridging the gap from bench to bedside.
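The abstract does not spell out how the aggregation step works, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each client trains a shared backbone plus one classification head per disease label it holds, that the server averages the backbone across all clients, and that it averages each label's head only over the clients that have that label. The function names, the dictionary layout, and the toy weights are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def average(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equally sized weight vectors."""
    n = len(vectors)
    return [sum(vals) / n for vals in zip(*vectors)]


def aggregate_round(clients: List[dict]) -> dict:
    """Merge client updates into one global model (illustrative only).

    Each client is {"backbone": Vector, "heads": {label: Vector}}.
    ASSUMPTION: this layout and the averaging rule are not taken from
    the paper; they are a simple stand-in for label-aware aggregation.
    """
    # Shared feature extractor: unweighted, FedAvg-style averaging.
    backbone = average([c["backbone"] for c in clients])

    # One head per disease label in the union of all client label sets.
    # A label's head is averaged only over the clients that train it,
    # so labels unique to one client are carried over unchanged.
    labels = sorted({label for c in clients for label in c["heads"]})
    heads: Dict[str, Vector] = {
        label: average([c["heads"][label] for c in clients if label in c["heads"]])
        for label in labels
    }
    return {"backbone": backbone, "heads": heads}


# Two toy clients with one overlapping label ("Edema"); weights are made up.
client_a = {"backbone": [1.0, 2.0],
            "heads": {"Edema": [0.25, 0.5], "Pneumonia": [1.0, 1.0]}}
client_b = {"backbone": [3.0, 4.0],
            "heads": {"Edema": [0.75, 1.5], "Hernia": [5.0, 5.0]}}

global_model = aggregate_round([client_a, client_b])
```

In this sketch the global model ends up with heads for all three labels even though neither client sees all of them, which is the property the abstract attributes to the 'global' model. A real system would likely weight clients by sample count and operate on full network tensors rather than short lists.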

research
03/10/2023

Optimizing Federated Learning for Medical Image Classification on Distributed Non-iid Datasets with Partial Labels

Numerous large-scale chest x-ray datasets have spearheaded expert-level ...
research
11/11/2022

From Competition to Collaboration: Making Toy Datasets on Kaggle Clinically Useful for Chest X-Ray Diagnosis Using Federated Learning

Chest X-ray (CXR) datasets hosted on Kaggle, though useful from a data s...
research
12/05/2022

FedCC: Robust Federated Learning against Model Poisoning Attacks

Federated Learning has emerged to cope with raising concerns about priva...
research
02/07/2020

Quantifying the Value of Lateral Views in Deep Learning for Chest X-rays

Most deep learning models in chest X-ray prediction utilize the posteroa...
research
07/30/2019

Exploring large scale public medical image datasets

Rationale and Objectives: Medical artificial intelligence systems are de...
