Tensor Relational Algebra for Machine Learning System Design

09/01/2020
by   Binhang Yuan, et al.
0

Machine learning (ML) systems have to support various tensor operations. However, such ML systems were largely developed without asking: what are the foundational abstractions necessary for building machine learning systems? We believe that proper computational and implementation abstractions will allow for the construction of self-configuring, declarative ML systems, especially when the goal is to execute tensor operations in a distributed environment, or partitioned across multiple AI accelerators (ASICs). To this end, we first introduce a tensor relational algebra (TRA), which is expressive to encode any tensor operation that can be written in the Einstein notation. We consider how TRA expressions can be re-written into an implementation algebra (IA) that enables effective implementation in a distributed environment, as well as how expressions in the IA can be optimized. Our empirical study shows that the optimized implementation provided by IA can reach or even out-perform carefully engineered HPC or ML systems for large scale tensor manipulations and ML workflows in distributed clusters.

READ FULL TEXT

page 1

page 2

page 3

page 4

07/28/2022

SpDISTAL: Compiling Distributed Sparse Tensor Computations

We introduce SpDISTAL, a compiler for sparse tensor algebra that targets...
01/02/2018

On Optimizing Operator Fusion Plans for Large-Scale Machine Learning in SystemML

Many large-scale machine learning (ML) systems allow specifying custom M...
03/08/2022

TTML: tensor trains for general supervised machine learning

This work proposes a novel general-purpose estimator for supervised mach...
10/09/2020

A Tensor Compiler for Unified Machine Learning Prediction Serving

Machine Learning (ML) adoption in the enterprise requires simpler and mo...
12/03/2015

MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems

MXNet is a multi-language machine learning (ML) library to ease the deve...
10/29/2014

High-Performance Distributed ML at Scale through Parameter Server Consistency Models

As Machine Learning (ML) applications increase in data size and model co...

Please sign up or login with your details

Forgot password? Click here to reset