Learning Models over Relational Data: A Brief Tutorial

11/15/2019
by   Maximilian Schleich, et al.

This tutorial overviews the state of the art in learning models over relational databases and makes the case for a first-principles approach that exploits recent developments in database research. The input to learning classification and regression models is a training dataset defined by feature extraction queries over relational databases. The mainstream approach to learning over relational data is to materialize the training dataset, export it out of the database, and then learn over it using a statistical package. This approach can be expensive as it requires the materialization of the training dataset. An alternative approach is to cast the machine learning problem as a database problem by transforming the data-intensive component of the learning task into a batch of aggregates over the feature extraction query and by computing this batch directly over the input database. The tutorial highlights a variety of techniques developed by the database theory and systems communities to improve the performance of the learning task. They rely on structural properties of the relational data and of the feature extraction query, including algebraic (semi-ring), combinatorial (hypertree width), statistical (sampling), or geometric (distance) structure. They also rely on factorized computation, code specialization, query compilation, and parallelization.
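To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the tutorial's implementation: the relations `Sales(store, units)` and `Stores(store, size)`, the natural join on `store` as the feature extraction query, and the choice of a one-feature least-squares regression are all hypothetical. For such a model, the data-intensive part of learning reduces to five aggregates over the join (COUNT, SUM(size), SUM(units), SUM(size*size), SUM(size*units)). The sketch computes this batch once by materializing the join and once by pushing partial aggregates past the join, and both routes give the same result.

```python
from collections import defaultdict

# Hypothetical input relations (assumptions for illustration only).
SALES = [("a", 10), ("a", 12), ("b", 20), ("b", 22), ("c", 31)]  # (store, units)
STORES = [("a", 1), ("b", 2), ("c", 3), ("d", 9)]                # (store, size)


def aggregates_materialized(sales, stores):
    """Mainstream route: build the joined training dataset, then aggregate it."""
    joined = [(size, units)
              for s1, units in sales
              for s2, size in stores
              if s1 == s2]
    return {
        "n": len(joined),
        "sx": sum(x for x, _ in joined),
        "sy": sum(y for _, y in joined),
        "sxx": sum(x * x for x, _ in joined),
        "sxy": sum(x * y for x, y in joined),
    }


def aggregates_pushed(sales, stores):
    """Alternative route: compute the same batch without building the join."""
    # Partial aggregates per join key, one pass over each relation.
    s_agg = defaultdict(lambda: [0, 0])     # store -> [COUNT, SUM(units)]
    for store, units in sales:
        s_agg[store][0] += 1
        s_agg[store][1] += units
    t_agg = defaultdict(lambda: [0, 0, 0])  # store -> [COUNT, SUM(size), SUM(size^2)]
    for store, size in stores:
        t_agg[store][0] += 1
        t_agg[store][1] += size
        t_agg[store][2] += size * size

    # Combine the partial aggregates key by key (sum of products).
    out = {"n": 0, "sx": 0, "sy": 0, "sxx": 0, "sxy": 0}
    for store, (cnt_s, sum_units) in s_agg.items():
        if store not in t_agg:
            continue  # no join partner, so the key contributes nothing
        cnt_t, sum_size, sum_size2 = t_agg[store]
        out["n"] += cnt_s * cnt_t
        out["sx"] += cnt_s * sum_size
        out["sy"] += sum_units * cnt_t
        out["sxx"] += cnt_s * sum_size2
        out["sxy"] += sum_units * sum_size
    return out


def fit_line(agg):
    """Closed-form least squares for units = slope * size + intercept."""
    n, sx, sy, sxx, sxy = (agg[k] for k in ("n", "sx", "sy", "sxx", "sxy"))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


if __name__ == "__main__":
    batch = aggregates_pushed(SALES, STORES)
    assert batch == aggregates_materialized(SALES, STORES)
    print(batch, fit_line(batch))
```

The pushed-down version touches each input tuple once and keeps only per-key partial sums, so its cost depends on the sizes of the input relations rather than on the size of the join result. This is a toy instance of the sum-product (semiring) structure that the abstract refers to; the techniques surveyed in the tutorial go well beyond this two-relation case.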


