Biomedical Interpretable Entity Representations

06/17/2021
by   Diego Garcia-Olano, et al.

Pre-trained language models induce dense entity representations that offer strong performance on entity-centric NLP tasks, but such representations are not immediately interpretable. This can be a barrier to model uptake in important domains such as biomedicine. There has been recent work on general interpretable representation learning (Onoe and Durrett, 2020), but these domain-agnostic representations do not readily transfer to biomedicine. In this paper, we create a new entity type system and training set from a large corpus of biomedical texts by mapping entities to concepts in a medical ontology, and from these to Wikipedia pages whose categories are our types. From this mapping we derive Biomedical Interpretable Entity Representations (BIERs), in which dimensions correspond to fine-grained entity types, and values are predicted probabilities that a given entity is of the corresponding type. We propose a novel method that exploits BIERs' final sparse and intermediate dense representations to facilitate model and entity type debugging. We show that BIERs achieve strong performance in biomedical tasks including named entity disambiguation and entity label classification, and we provide error analysis to highlight the utility of their interpretability, particularly in low-supervision settings. Finally, we provide our induced 68K biomedical type system, the corresponding 37 million triples of derived data used to train BIER models, and our best performing model.
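To make the idea of a representation whose dimensions are entity types more concrete, here is a minimal sketch and not the authors' implementation. It assumes that a model emits one raw score per type and that each score is turned into an independent probability with a sigmoid; the four-type vocabulary, the scores, and the use of cosine similarity to compare a mention with candidate entities are all hypothetical choices made for illustration.

```python
import math

# Hypothetical type vocabulary. The type system described in the abstract
# has about 68K types drawn from Wikipedia categories.
TYPES = ["enzymes", "proteins", "infectious diseases", "antibiotics"]


def to_interpretable(scores):
    """Map one raw score per type to a probability in (0, 1).

    Using an independent sigmoid per type is an assumption of this sketch.
    """
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]


def top_types(probs, k=2):
    """Return the k (type, probability) pairs with the highest probability."""
    ranked = sorted(zip(TYPES, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up scores for one mention and two candidate entities.
mention = to_interpretable([-3.0, -2.0, 0.5, 4.0])
candidate_a = to_interpretable([-4.0, -3.0, 1.0, 3.5])
candidate_b = to_interpretable([3.0, 4.0, -2.0, -3.0])

# Each dimension can be read directly as "how likely is this type".
print(top_types(mention))

# A simple, assumed way to disambiguate: pick the closest candidate.
print(cosine(mention, candidate_a) > cosine(mention, candidate_b))
```

Because every dimension carries a type name, a wrong prediction can be inspected by listing the highest-probability types of the mention and of the chosen candidate, which is the kind of debugging the abstract refers to.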

research
12/03/2022

Intermediate Entity-based Sparse Interpretable Representation Learning

Interpretable entity representations (IERs) are sparse embeddings that a...
research
04/30/2020

Interpretable Entity Representations through Large-Scale Typing

In standard methodology for natural language processing, entities in tex...
research
06/30/2023

Biomedical Language Models are Robust to Sub-optimal Tokenization

As opposed to general English, many concepts in biomedical terminology h...
research
07/20/2023

UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition

Pre-trained transformer language models (LMs) have in recent years becom...
research
05/11/2023

Detecting Idiomatic Multiword Expressions in Clinical Terminology using Definition-Based Representation Learning

This paper shines a light on the potential of definition-based semantic ...
research
04/30/2019

Fine-grained Entity Recognition with Reduced False Negatives and Large Type Coverage

Fine-grained Entity Recognition (FgER) is the task of detecting and clas...
research
03/22/2017

Supervised Typing of Big Graphs using Semantic Embeddings

We propose a supervised algorithm for generating type embeddings in the ...
