Towards trustworthy seizure onset detection using workflow notes

06/14/2023
by   Khaled Saab, et al.

A major barrier to deploying healthcare AI models is their trustworthiness. One form of trustworthiness is a model's robustness across different subgroups: while existing models may exhibit expert-level performance on aggregate metrics, they often rely on non-causal features, leading to errors in hidden subgroups. To take a step closer towards trustworthy seizure onset detection from EEG, we propose to leverage annotations that healthcare personnel produce in routine clinical workflows, which we refer to as workflow notes and which include multiple event descriptions beyond seizures. Using workflow notes, we first show that scaling training data to an unprecedented 68,920 EEG hours significantly improves seizure onset detection performance (+12.3 AUROC points) compared to relying on smaller training sets with expensive manual gold-standard labels. Second, we reveal that our binary seizure onset detection model underperforms on clinically relevant subgroups (e.g., a gap of up to 6.5 AUROC points between pediatric and adult patients) and has a significantly higher false positive rate on EEG clips showing non-epileptiform abnormalities than on EEG clips overall (+19 FPR points). To improve model robustness to hidden subgroups, we train a multilabel model that classifies 26 attributes other than seizures, such as spikes, slowing, and movement artifacts. We find that our multilabel model significantly improves overall seizure onset detection performance (+5.9 AUROC points), greatly improves performance among subgroups (up to +8.3 AUROC points), and decreases false positives on non-epileptiform abnormalities by 8 FPR points. Finally, we propose a clinical utility metric based on false positives per 24 EEG hours and find that our multilabel model improves this metric by a factor of 2 across different clinical settings.
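The abstract does not spell out how false positives per 24 EEG hours are counted, so the following is only a minimal sketch under stated assumptions: the recording is split into non-overlapping clips of fixed length, each clip has a binary label and a model score, and every non-seizure clip scored at or above the threshold counts as one false alarm. The function name, its parameters, and the clip-level counting rule are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def false_positives_per_24h(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float,
    clip_seconds: float,
) -> float:
    """Rate of false alarms, normalized to 24 hours of EEG.

    Assumption (not from the paper): clips are non-overlapping and of equal
    length, and each flagged non-seizure clip is one false positive.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    if not labels:
        raise ValueError("at least one clip is required")
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")

    # Count clips without a seizure (label 0) that the model flags anyway.
    false_positives = sum(
        1 for y, s in zip(labels, scores) if y == 0 and s >= threshold
    )

    # Total duration of scored EEG, in hours.
    total_hours = len(labels) * clip_seconds / 3600.0

    # Scale the count to a 24-hour window.
    return false_positives * 24.0 / total_hours


# Usage: 1440 one-minute clips cover exactly 24 hours of EEG.
labels = [0] * 1440
scores = [0.9, 0.9, 0.9] + [0.1] * 1437
rate = false_positives_per_24h(labels, scores, threshold=0.5, clip_seconds=60)
```

Under these assumptions, halving this rate at a fixed sensitivity would correspond to the factor-of-2 improvement the abstract reports; the paper's own definition may merge consecutive flagged clips into single alarm events, which would give lower counts.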


