Splitability Annotations: Optimizing Black-Box Function Composition in Existing Libraries
Data movement is a major bottleneck in parallel data-intensive applications. In response to this problem, researchers have proposed new runtimes and intermediate representations (IRs) that apply optimizations such as loop fusion under existing library APIs. Even though these runtimes generally do not require changes to user code, they require intrusive changes to the library itself: often, all the library functions need to be rewritten for a new IR or virtual machine. In this paper, we propose a new abstraction called splitability annotations (SAs) that enables key data movement optimizations on black-box library functions. SAs only require that users add an annotation for each existing, unmodified function and implement a small API to split data values in the library. Together, the annotations and the split API describe how to partition values that are passed among functions, enabling data pipelining and automatic parallelization while respecting each library's correctness constraints. We implement SAs in a system called Mozart. On workloads using NumPy and Pandas in Python and Intel MKL in C, and without modifying any library function, Mozart is in many cases competitive with intrusive solutions that require rewriting libraries, sometimes improves performance over past systems by up to 2x, and accelerates workloads by up to 30x.
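The idea of splitting values so that a chain of unmodified functions can be pipelined over small chunks can be illustrated with a minimal sketch. This is not Mozart's actual interface: the names `split`, `merge`, `run_pipelined`, `lib_sqrt`, and `lib_log1p` are hypothetical, and plain Python lists stand in for library arrays so that the sketch has no dependencies.

```python
import math


def split(values, chunk_size):
    """Hypothetical split API: yield consecutive chunks of a list."""
    for start in range(0, len(values), chunk_size):
        yield values[start:start + chunk_size]


def merge(chunks):
    """Hypothetical merge API: concatenate chunk results in order."""
    merged = []
    for chunk in chunks:
        merged.extend(chunk)
    return merged


# Stand-ins for unmodified, black-box library functions.
# Both are element-wise, so they can safely run on any chunk.
def lib_sqrt(values):
    return [math.sqrt(v) for v in values]


def lib_log1p(values):
    return [math.log1p(v) for v in values]


def run_pipelined(functions, values, chunk_size):
    """Apply every function to one chunk before moving to the next chunk,
    so each intermediate result is only chunk-sized."""
    results = []
    for chunk in split(values, chunk_size):
        for function in functions:
            chunk = function(chunk)
        results.append(chunk)
    return merge(results)


data = [float(i) for i in range(10_000)]
pipelined = run_pipelined([lib_sqrt, lib_log1p], data, chunk_size=1024)
unsplit = lib_log1p(lib_sqrt(data))  # baseline: full-size intermediate list
```

In this sketch the chunked and unsplit results agree because both functions are element-wise. A function whose output depends on the whole input, such as a cumulative sum, could not be split this way without extra handling, which is the kind of correctness constraint an annotation would have to capture. Since chunks are independent here, they could also be handed to separate workers.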