SOL: Effortless Device Support for AI Frameworks without Source Code Changes

by Nicolas Weber, et al.

Modern high-performance computing clusters rely heavily on accelerators to overcome the limited compute power of CPUs. These supercomputers run applications from many domains, such as simulations, numerical applications, and artificial intelligence (AI). As a result, vendors need to be able to run a wide variety of workloads efficiently on their hardware. In the AI domain this is exacerbated by the existence of several popular frameworks (e.g., PyTorch, TensorFlow) that share no common code base and can vary in functionality. The code of these frameworks evolves quickly, making it expensive to keep up with all changes and potentially forcing developers to go through constant rounds of upstreaming. In this paper we explore how to provide hardware support in AI frameworks without changing the framework's source code, in order to minimize maintenance overhead. We introduce SOL, an AI acceleration middleware that provides a hardware abstraction layer, allowing us to transparently support heterogeneous hardware. As a proof of concept, we implemented SOL for PyTorch with three backends: CPUs, GPUs, and vector processors.
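
To make the idea of a hardware abstraction layer more concrete, here is a minimal, purely illustrative sketch in plain Python. It is not SOL's API and does not reflect how SOL is implemented; the names `register_backend`, `run_add`, and the device strings are hypothetical. It only shows the general pattern of keeping device-specific kernels behind a registry, so that a new device can be added without touching the calling code.

```python
from typing import Callable, Dict, List

Kernel = Callable[[List[float], List[float]], List[float]]

# Hypothetical registry mapping a device name to its kernel for one operation.
_BACKENDS: Dict[str, Kernel] = {}


def register_backend(name: str) -> Callable[[Kernel], Kernel]:
    """Register a device-specific kernel under the given device name."""
    def decorator(fn: Kernel) -> Kernel:
        _BACKENDS[name] = fn
        return fn
    return decorator


@register_backend("cpu")
def add_cpu(a: List[float], b: List[float]) -> List[float]:
    # Reference implementation: element-wise addition in a Python loop.
    return [x + y for x, y in zip(a, b)]


@register_backend("vector")
def add_vector(a: List[float], b: List[float]) -> List[float]:
    # Stand-in for a vector-processor kernel; here it computes the same result.
    return list(map(sum, zip(a, b)))


def run_add(a: List[float], b: List[float], device: str = "cpu") -> List[float]:
    """Dispatch to the kernel registered for `device`; the caller never changes."""
    try:
        kernel = _BACKENDS[device]
    except KeyError:
        raise ValueError(f"no backend registered for device {device!r}")
    return kernel(a, b)
```

In this sketch, supporting another device means registering one more kernel; `run_add` and everything that calls it stay unchanged, which is the maintenance property the abstract is aiming for at the framework level.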




SOL: Reducing the Maintenance Overhead for Integrating Hardware Support into AI Frameworks

The increased interest in Artificial Intelligence (AI) raised the need f...

Hardless: A Generalized Serverless Compute Architecture for Hardware Processing Accelerators

The increasing use of hardware processing accelerators tailored for spec...

Characterizing Deep Learning Training Workloads on Alibaba-PAI

Modern deep learning models have been exploited in various domains, incl...

BrainSlug: Transparent Acceleration of Deep Learning Through Depth-First Parallelism

Neural network frameworks such as PyTorch and TensorFlow are the workhor...

Towards hardware acceleration for parton densities estimation

In this proceedings we describe the computational challenges associated ...

DRAGON (Differentiable Graph Execution) : A suite of Hardware Simulation and Optimization tools for Modern AI/Non-AI Workloads

We introduce DRAGON, an open-source, fast and explainable hardware simul...
