MapSDI: A Scaled-up Semantic Data Integration Framework for Knowledge Graph Creation

09/03/2019
by Samaneh Jozashoori, et al.

Semantic web technologies have contributed effective solutions to the problems of data integration and knowledge graph creation. However, with the rapid growth of big data in diverse domains, several interoperability issues remain to be addressed, scalability being one of the main challenges. In this paper, we address the problem of knowledge graph creation at scale and present MapSDI, a mapping rule-based framework for optimizing semantic data integration into knowledge graphs. MapSDI enables efficient semantic enrichment of large, heterogeneous, and potentially low-quality data. The input of MapSDI is a set of data sources and mapping rules expressed in a mapping language such as RML. First, MapSDI pre-processes the sources based on semantic information extracted from the mapping rules by applying basic database operators: it projects the required attributes, eliminates duplicates, and selects relevant entries. All these operators are defined based on the knowledge encoded in the mapping rules, which are then used by the semantification engine (or RDFizer) to produce a knowledge graph. We have empirically studied the impact of MapSDI on existing RDFizers and observed that knowledge graph creation time can be reduced, on average, by one order of magnitude. It is also shown theoretically that the source and rule transformations provided by MapSDI are data-lossless.
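
The three pre-processing operators named in the abstract (projection, duplicate elimination, selection) can be sketched in a few lines. This is only an illustration under stated assumptions, not MapSDI's implementation: the function name `preprocess`, the sample rows, the attribute names, and the "non-empty value" selection rule are all hypothetical, and the list of required attributes is passed in by hand rather than extracted from RML mapping rules.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, str]


def preprocess(
    rows: Sequence[Row],
    required: Sequence[str],
    keep: Callable[[Row], bool] = lambda row: True,
) -> List[Row]:
    """Select relevant rows, project required attributes, drop duplicates.

    `required` stands in for the attributes a mapping rule references;
    `keep` stands in for a selection condition derived from the rules.
    Both are assumptions of this sketch.
    """
    seen = set()
    result: List[Row] = []
    for row in rows:
        # Selection: skip entries the mapping rules would not use.
        if not keep(row):
            continue
        # Projection: keep only the attributes the rules reference.
        projected = tuple(row[attr] for attr in required)
        # Duplicate elimination: emit each projected tuple once,
        # preserving first-seen order.
        if projected in seen:
            continue
        seen.add(projected)
        result.append(dict(zip(required, projected)))
    return result


# Hypothetical source: four rows, of which only `gene` is mapped.
rows = [
    {"id": "1", "gene": "BRCA1", "note": "a"},
    {"id": "2", "gene": "BRCA1", "note": "b"},
    {"id": "3", "gene": "", "note": "c"},
    {"id": "4", "gene": "TP53", "note": "d"},
]

reduced = preprocess(rows, ["gene"], keep=lambda row: row["gene"] != "")
print(reduced)
```

In this toy input, four source rows shrink to two before any RDF generation happens, which is the kind of reduction the abstract credits for the shorter creation time. Whether a given selection rule is safe to apply depends on the mapping rules themselves.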

research
11/05/2018

Data Integration for Supporting Biomedical Knowledge Graph Creation at Large-Scale

In recent years, following FAIR and open data principles, the number of ...
research
08/31/2020

FunMap: Efficient Execution of Functional Mappings for Knowledge Graph Creation

Data has exponentially grown in the last years, and knowledge graphs con...
research
08/17/2020

SDM-RDFizer: An RML Interpreter for the Efficient Creation of RDF Knowledge Graphs

In recent years, the amount of data has increased exponentially, and kno...
research
01/24/2022

Scaling Up Knowledge Graph Creation to Large and Heterogeneous Data Sources

RDF knowledge graphs (KG) are powerful data structures to represent fact...
research
08/03/2020

Knowledge Translation: Extended Technical Report

We introduce Kensho, a tool for generating mapping rules between two Kno...
research
10/26/2022

Dragoman: Efficiently Evaluating Declarative Mapping Languages over Frameworks for Knowledge Graph Creation

In recent years, there have been valuable efforts and contributions to m...
research
10/26/2022

ProVe: A Pipeline for Automated Provenance Verification of Knowledge Graphs against Textual Sources

Knowledge Graphs are repositories of information that gather data from a...
