TapirXLA: Embedding Fork-Join Parallelism into the XLA Compiler in TensorFlow Using Tapir

08/29/2019
by Tao B. Schardl, et al.

This work introduces TapirXLA, a replacement for TensorFlow's XLA compiler that embeds recursive fork-join parallelism into XLA's low-level representation of code. Machine-learning applications rely on efficient parallel processing to achieve performance, and they employ a variety of technologies to improve performance, including compiler technology. But compilers in machine-learning frameworks lack a deep understanding of parallelism, causing them to lose performance by missing optimizations on parallel computation. This work studies how Tapir, a compiler intermediate representation (IR) that embeds parallelism into a mainstream compiler IR, can be incorporated into a compiler for machine learning to remedy this problem. TapirXLA modifies the XLA compiler in TensorFlow to employ the Tapir/LLVM compiler to optimize low-level parallel computation. TapirXLA encodes the parallelism within high-level TensorFlow operations using Tapir's representation of fork-join parallelism. TapirXLA also exposes to the compiler implementations of linear-algebra library routines whose parallel operations are encoded using Tapir's representation. We compared the performance of TensorFlow using TapirXLA against TensorFlow using an unmodified XLA compiler. On four neural-network benchmarks, TapirXLA speeds up the parallel running time of the network by a geometric-mean multiplicative factor of 30

Related research

- 09/21/2022: UPIR: Toward the Design of Unified Parallel Intermediate Representation for Parallel Programming Models. The complexity of heterogeneous computing architectures, as well as the ...
- 03/08/2019: Auto-Vectorizing TensorFlow Graphs: Jacobians, Auto-Batching And Beyond. We propose a static loop vectorization optimization on top of high level...
- 11/23/2012: Theano: new features and speed improvements. Theano is a linear algebra compiler that optimizes a user's symbolically...
- 06/28/2022: Memory Safe Computations with XLA Compiler. Software packages like TensorFlow and PyTorch are designed to support li...
- 03/09/2021: DISC: A Dynamic Shape Compiler for Machine Learning Workloads. Many recent machine learning models show dynamic shape characteristics. ...
- 07/01/2021: Efficient Tree-Traversals: Reconciling Parallelism and Dense Data Representations. Recent work showed that compiling functional programs to use dense, seri...
- 01/08/2022: A Compiler Framework for Optimizing Dynamic Parallelism on GPUs. Dynamic parallelism on GPUs allows GPU threads to dynamically launch oth...
