German BERT Model for Legal Named Entity Recognition

03/07/2023
by Harshil Darji, et al.

The use of BERT, one of the most popular language models, has led to improvements in many Natural Language Processing (NLP) tasks. One such task is Named Entity Recognition (NER), i.e., the automatic identification of named entities such as locations, persons, and organizations in a given text. NER is also an important base step for many NLP tasks such as information extraction and argumentation mining. Although much research has been done on NER using BERT and other popular language models, it has not been explored in detail for Legal NLP or Legal Tech. Legal NLP applies NLP techniques such as sentence similarity or NER specifically to legal data. There are only a handful of models for NER tasks that use BERT language models, and none of them is aimed at legal documents in German. In this paper, we fine-tune a popular BERT language model trained on German data (German BERT) on a Legal Entity Recognition (LER) dataset. To make sure our model is not overfitting, we perform a stratified 10-fold cross-validation. The results we achieve by fine-tuning German BERT on the LER dataset outperform the BiLSTM-CRF+ model used by the authors of the same LER dataset. Finally, we make the model openly available via HuggingFace.
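As a rough illustration of what a stratified 10-fold split does, the sketch below partitions a toy dataset so that every fold keeps the class proportions of the full data. It is not the authors' code: the abstract does not say how stratification was defined, so the function name `stratified_k_fold`, the one-label-per-sentence setup, and the toy `PER`/`ORG`/`LOC` labels are all assumptions made for this example.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full data.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class round-robin over the folds. The running offset continues
    # where the previous class stopped, which keeps fold sizes balanced.
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)

    # Each fold serves once as the held-out test set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy data (assumed): one coarse label per sentence, e.g. its dominant entity type.
labels = ["PER"] * 50 + ["ORG"] * 30 + ["LOC"] * 20
splits = list(stratified_k_fold(labels, k=10))
```

In a real fine-tuning run, each `(train_idx, test_idx)` pair would select the sentences used to train and evaluate one model, and the per-fold scores would then be compared or averaged to check that performance does not depend on a single lucky split.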


research
03/19/2020

Beheshti-NER: Persian Named Entity Recognition Using BERT

Named entity recognition is a natural language processing task to recogn...
research
03/29/2020

A Dataset of German Legal Documents for Named Entity Recognition

We describe a dataset developed for Named Entity Recognition in German f...
research
12/01/2021

Wiki to Automotive: Understanding the Distribution Shift and its impact on Named Entity Recognition

While transfer learning has become a ubiquitous technique used across Na...
research
10/21/2020

German's Next Language Model

In this work we present the experiments which lead to the creation of ou...
research
04/23/2021

Optimizing small BERTs trained for German NER

Currently, the most widespread neural network architecture for training ...
research
12/05/2018

Inflection-Tolerant Ontology-Based Named Entity Recognition for Real-Time Applications

A growing number of applications users daily interact with have to opera...
research
06/29/2022

GERNERMED++: Transfer Learning in German Medical NLP

We present a statistical model for German medical natural language proce...
