Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

07/05/2019
by Zenan Zhai, et al.

Chemical patents are an important resource for chemical information. However, few chemical Named Entity Recognition (NER) systems have been evaluated on patent documents, due in part to the structural and linguistic complexity of patents. In this paper, we explore the NER performance of a BiLSTM-CRF model that combines pre-trained word embeddings, character-level word representations and contextualized ELMo word representations on chemical patents. We compare word embeddings pre-trained on biomedical corpora with those pre-trained on chemical patent corpora, and we also examine how tokenizers optimized for the chemical domain affect NER performance. Results on two patent corpora show that contextualized word representations generated from ELMo substantially improve chemical NER performance relative to the current state of the art. We also show that domain-specific resources, such as word embeddings trained on chemical patents and chemical-specific tokenizers, have a positive impact on NER performance.
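
To make the tokenization point concrete, the following is a minimal sketch of why a chemistry-aware tokenizer can matter for NER. It is not the tokenizers evaluated in the paper: both `general_tokenize` and `chem_tokenize` are hypothetical toy rules, and the example sentence is invented for illustration. The sketch only shows that a punctuation-splitting rule fragments a systematic chemical name into many tokens, while a rule that splits on whitespace and peels off trailing sentence punctuation keeps the name intact.

```python
import re

# Hypothetical general-purpose rule (an assumption, not from the paper):
# every run of word characters and every punctuation mark is its own token.
GENERAL_PATTERN = re.compile(r"\w+|[^\w\s]")

# Hypothetical chemistry-aware rule (also an assumption): split on whitespace
# only, then separate trailing sentence punctuation from the rest of the chunk.
TRAILING_PUNCT = re.compile(r"^(.*?)([.,;:]*)$")


def general_tokenize(text):
    """Split text on whitespace and on every punctuation character."""
    return GENERAL_PATTERN.findall(text)


def chem_tokenize(text):
    """Keep hyphens, digits and brackets inside a chunk; peel off trailing punctuation."""
    tokens = []
    for chunk in text.split():
        match = TRAILING_PUNCT.match(chunk)
        core, trailing = match.group(1), match.group(2)
        if core:
            tokens.append(core)
        # Each trailing punctuation character becomes its own token.
        tokens.extend(trailing)
    return tokens


# Invented example sentence containing a systematic chemical name.
sentence = "The residue was dissolved in 2-(4-chlorophenyl)ethanol, then stirred."

if __name__ == "__main__":
    print(general_tokenize(sentence))
    print(chem_tokenize(sentence))
```

With the general rule, the chemical name is spread over several tokens, so a sequence tagger has to label every fragment correctly to recover the entity. With the chemistry-aware rule the name is a single token. Real chemical tokenizers handle many more cases than this toy rule does.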

research
08/25/2018

Comparing CNN and LSTM character-level embeddings in BiLSTM-CRF models for chemical and disease named entity recognition

We compare the use of LSTM-based and CNN-based character-level word embe...
research
11/07/2018

microNER: A Micro-Service for German Named Entity Recognition based on BiLSTM-CRF

For named entity recognition (NER), bidirectional recurrent neural netwo...
research
09/23/2019

GNTeam at 2018 n2c2: Feature-augmented BiLSTM-CRF for drug-related entity recognition in hospital discharge summaries

Monitoring the administration of drugs and adverse drug reactions are ke...
research
10/01/2019

Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations

Contextualized word representations are able to give different represent...
research
01/05/2020

Computationally Efficient NER Taggers with Combined Embeddings and Constrained Decoding

Current State-of-the-Art models in Named Entity Recognition (NER) are ne...
research
12/24/2022

A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models

Objective. Chemical named entity recognition (NER) models have the poten...
research
12/01/2016

Domain Adaptation for Named Entity Recognition in Online Media with Word Embeddings

Content on the Internet is heterogeneous and arises from various domains...
