Few-Shot Electronic Health Record Coding through Graph Contrastive Learning

by Shanshan Wang, et al.

Electronic health record (EHR) coding is the task of assigning ICD codes to each EHR. Most previous studies either focus only on frequent ICD codes or treat rare and frequent ICD codes in the same way. These methods perform well on frequent ICD codes, but because the distribution of ICD codes is extremely unbalanced, their performance on rare codes is far from satisfactory. We seek to improve performance on both frequent and rare ICD codes with a contrastive graph-based EHR coding framework, CoGraph, which recasts EHR coding as a few-shot learning task. First, we construct a heterogeneous EHR word-entity (HEWE) graph for each EHR, where the words and entities extracted from the EHR serve as nodes and the relations between them serve as edges. Then, CoGraph learns similarities and dissimilarities between HEWE graphs from different ICD codes so that information can be transferred among them. In a few-shot learning scenario, the model has access only to frequent ICD codes during training, which might force it to encode features that are useful for frequent ICD codes only. To mitigate this risk, CoGraph devises two graph contrastive learning schemes, GSCL and GECL, that exploit the HEWE graph structures to encode transferable features. GSCL utilizes the intra-correlation of different sub-graphs sampled from HEWE graphs, while GECL exploits the inter-correlation among HEWE graphs at different clinical stages. Experiments on the MIMIC-III benchmark dataset show that CoGraph significantly outperforms state-of-the-art methods on EHR coding, on both frequent and rare ICD codes, in terms of several evaluation indicators. On frequent ICD codes, GSCL and GECL improve the classification accuracy and F1 by 1.31 [...] improvements by 2.12 [...]
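
The abstract does not state which contrastive objective GSCL and GECL use. As a rough illustration of the general idea only (pulling two views of the same graph together while pushing views of other graphs apart), the sketch below computes an InfoNCE-style loss, a common choice in contrastive learning, on toy embedding vectors. The function names, the temperature value, and the embeddings are all hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor (assumed objective, not the paper's).

    Returns -log of the softmax probability assigned to the positive
    among the positive and all negatives.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]

# Toy embeddings (made up): two sub-graph views of one graph,
# plus embeddings standing in for other graphs.
view_a = [1.0, 0.2, 0.0]
view_b = [0.9, 0.3, 0.1]
other_graphs = [[0.0, 1.0, 0.0], [0.0, 0.1, 1.0]]

# The loss is lower when the positive really is the matching view.
loss_aligned = info_nce(view_a, view_b, other_graphs)
loss_misaligned = info_nce(view_a, other_graphs[0], [view_b, other_graphs[1]])
```

In this toy setup the aligned pairing yields the smaller loss, which is the signal a contrastive scheme would use to shape the graph encoder. How CoGraph samples sub-graphs or pairs graphs across clinical stages is not specified here.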




