Benchmarking Blocking Algorithms for Web Entities

05/19/2020
by   Vasilis Efthymiou, et al.

An increasing number of entities are described by interlinked data rather than by documents on the Web. Entity Resolution (ER) aims to identify descriptions of the same real-world entity within one knowledge base or across several knowledge bases in the Web of data. To reduce the number of pairwise comparisons required among descriptions, ER methods typically perform a pre-processing step, called blocking, which places similar entity descriptions into blocks so that only descriptions within the same block are compared. We experimentally evaluate several blocking methods proposed for the Web of data using real datasets, whose characteristics significantly impact the methods' effectiveness and efficiency. The proposed experimental evaluation framework allows us to better understand the characteristics of the missed matching entity descriptions and contrast them with ground truth obtained from different kinds of relatedness links.
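To make the blocking step concrete, here is a minimal sketch of token blocking, one simple schema-agnostic strategy. It is an illustration only and not necessarily one of the methods evaluated in the paper. The entity identifiers, attribute names, and the helper functions `token_blocking` and `candidate_pairs` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def token_blocking(descriptions):
    """Build one block per token: token -> set of entity ids whose values contain it."""
    blocks = defaultdict(set)
    for entity_id, attributes in descriptions.items():
        for value in attributes.values():
            # Attribute names are ignored; only tokens in the values matter.
            for token in str(value).lower().split():
                blocks[token].add(entity_id)
    # A block with a single description yields no comparison, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct pairs of descriptions that share at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical descriptions from two knowledge bases (made-up data).
descriptions = {
    "kb1:e1": {"name": "Stanley Kubrick", "born": "New York"},
    "kb2:e7": {"label": "Kubrick Stanley", "job": "director"},
    "kb2:e9": {"label": "Heraklion", "country": "Greece"},
}

blocks = token_blocking(descriptions)
pairs = candidate_pairs(blocks)
print(sorted(blocks))
print(pairs)
```

In this toy input, exhaustive comparison would examine three pairs, while blocking leaves only the pair that shares tokens. The same mechanism also shows how matches can be missed: two descriptions of the same entity that share no token never land in a common block.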

research
05/15/2019

MinoanER: Schema-Agnostic, Non-Iterative, Massively Parallel Resolution of Web Entities

Entity Resolution (ER) aims to identify different descriptions in variou...
research
03/05/2021

Pilot Investigation for a Comprehensive Taxonomy of Autonomous Entities

This paper documents an exploratory pilot study to define the term Auton...
research
02/25/2022

How to reduce the search space of Entity Resolution: with Blocking or Nearest Neighbor search?

Entity Resolution suffers from quadratic time complexity. To increase it...
research
05/15/2019

End-to-End Entity Resolution for Big Data: A Survey

One of the most important tasks for improving data quality and the relia...
research
05/31/2018

Skyblocking for Entity Resolution

In this paper, for the first time, we introduce the concept of skyblocki...
research
08/23/2023

Tau Prolog: A Prolog interpreter for the Web

Tau Prolog is a client-side Prolog interpreter fully implemented in Java...
research
05/15/2019

A Survey of Blocking and Filtering Techniques for Entity Resolution

Efficiency techniques are an integral part of Entity Resolution, since i...
