High-Level ETL for Semantic Data Warehouses—Full Version

by Rudra Pratap Deb Nath, et al.

The popularity of the Semantic Web (SW) encourages organizations to organize and publish semantic data using the RDF model. This growth poses new requirements for Business Intelligence (BI) technologies, which must enable On-Line Analytical Processing (OLAP)-like analysis over semantic data. Traditional Extract-Transform-Load (ETL) tools do not support incorporating semantic data into a Data Warehouse (DW) because they do not consider semantic issues in the integration process. In this paper, we propose a layer-based integration process and a set of high-level RDF-based ETL constructs required to define, map, extract, process, transform, integrate, update, and load (multidimensional) semantic data. Unlike other ETL tools, we automate the ETL data flows by creating metadata at the schema level, which relieves ETL developers from the burden of manual mapping at the ETL operation level. We create a prototype, named Semantic ETL Construct (SETLCONSTRUCT), based on the innovative ETL constructs proposed here. To evaluate SETLCONSTRUCT, we use it to create a multidimensional semantic DW by integrating a Danish Business dataset and an EU Subsidy dataset, and we compare it with the previous programmable framework SETLPROG in terms of productivity, development time, and performance. The evaluation shows that 1) SETLCONSTRUCT uses 92% [...] (the extension of SETLCONSTRUCT for generating ETL execution flow automatically) further reduces the Number of Used Concepts (NOUC) by another 25% [...] compared to SETLPROG, and is cut by another 27% [...] SETLCONSTRUCT is scalable and has similar performance compared to SETLPROG.




