Automatic Generation of Benchmarks for Entity Recognition and Linking

The velocity dimension of Big Data plays an increasingly important role in processing unstructured data. Until now, no large-scale benchmarks have been available to evaluate the performance of named entity recognition and entity linking solutions, because creating gold standards for these tasks is time-intensive, costly and error-prone. We hence investigate the automatic generation of benchmark texts with entity annotations for named entity recognition and linking from Linked Data. The main advantage of automatically constructed benchmarks is that they can be readily generated at any time and are cost-effective, while being guaranteed to achieve gold-standard quality. We compare the performance of 11 tools on the benchmarks we generate with their performance on 16 manually created benchmarks. Our results suggest that our automatic benchmark generation approach can create varied benchmarks with characteristics similar to those of existing benchmarks. In addition, we perform a large-scale runtime evaluation of entity recognition and linking solutions for the first time in the literature. Our experimental results are available at http://faturl.com/bengalexps/?open.
