Towards reliable named entity recognition in the biomedical domain

01/15/2020
by John Giorgi, et al.

Motivation: Automatic biomedical named entity recognition (BioNER) is a key task in biomedical information extraction. For some time, state-of-the-art BioNER has been dominated by machine learning methods, particularly conditional random fields (CRFs), with a recent focus on deep learning. However, recent work has suggested that the high performance of CRFs for BioNER may not generalize to corpora other than the one they were trained on. In our analysis, we find that a popular deep learning-based approach to BioNER, known as bidirectional long short-term memory network-conditional random field (BiLSTM-CRF), is correspondingly poor at generalizing. To address this, we evaluate three modifications of BiLSTM-CRF for BioNER to improve generalization: improved regularization via variational dropout, transfer learning and multi-task learning.

Results: We measure the effect that each strategy has when training and testing on the same corpus (‘in-corpus’ performance) and when training on one corpus and evaluating on another (‘out-of-corpus’ performance), our measure of the model’s ability to generalize. We found that variational dropout improves out-of-corpus performance by an average of 4.62%, transfer learning by 6.48% and multi-task learning by 8.42%. The maximal increase we identified combines multi-task learning and variational dropout, which boosts out-of-corpus performance by 10.75%. Furthermore, we make available a new open-source tool, called Saber, that implements our best BioNER models.

Availability and implementation: Source code for our biomedical IE tool is available at https://github.com/BaderLab/saber. Corpora and other resources used in this study are available at https://github.com/BaderLab/Towards-reliable-BioNER.
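The in-corpus versus out-of-corpus distinction described in the Results can be illustrated with a minimal sketch. This is not the authors' evaluation code or the Saber API. It assumes exact-match, entity-level F1 (the abstract does not state the metric), and the names `f1_score`, `generalization_report`, `train` and `predict` are hypothetical placeholders for whatever model and corpora are being compared.

```python
from statistics import mean


def f1_score(gold, pred):
    """Entity-level F1 over sets of (start, end, label) spans.

    Assumption: a prediction counts only if it matches a gold span exactly.
    """
    gold, pred = set(gold), set(pred)
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def generalization_report(corpora, train, predict):
    """Return (mean in-corpus F1, mean out-of-corpus F1).

    corpora: dict mapping a corpus name to {"train": ..., "test": ...},
             where each test split carries its gold spans under "spans".
    train:   hypothetical callable, train(train_split) -> model.
    predict: hypothetical callable, predict(model, test_split) -> set of spans.
    """
    in_corpus, out_of_corpus = [], []
    for source, source_data in corpora.items():
        # Train once per source corpus, then score on every test split.
        model = train(source_data["train"])
        for target, target_data in corpora.items():
            test_split = target_data["test"]
            score = f1_score(test_split["spans"], predict(model, test_split))
            if source == target:
                in_corpus.append(score)       # same corpus: in-corpus
            else:
                out_of_corpus.append(score)   # different corpus: out-of-corpus
    return mean(in_corpus), mean(out_of_corpus)
```

With two or more corpora, a gap between the two returned averages is one simple way to express how much a model's performance drops when it is evaluated away from its training corpus.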

