Performance Analysis of Deep Learning Workloads on a Composable System

03/19/2021
by Kaoutar El Maghraoui, et al.

A composable infrastructure is a pool of shared resources, such as compute, storage, accelerators, and networking, that can be grouped in various configurations to meet application requirements. This freedom to 'mix and match' resources dynamically allows for experimentation early in the design cycle, before the final architectural design or hardware implementation of a system. The design offers the flexibility to serve a variety of workloads and provides a dynamic co-design platform for running experiments and measurements in a controlled manner. For instance, key performance bottlenecks can be revealed early in the experimentation phase, avoiding costly and time-consuming mistakes. Additionally, various system-level topologies can be evaluated when experimenting with new Systems on Chip (SoCs) and new accelerator types. This paper details the design of an enterprise composable infrastructure that we have implemented and made available to our partners in the IBM Research AI Hardware Center (AIHC). Our experimental evaluations on the composable system give insights into how the system works and evaluate the impact of various resource aggregations and reconfigurations on representative deep learning benchmarks.


