PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices

by   Yuji Chai, et al.

The ability to accurately predict deep neural network (DNN) inference performance metrics, such as latency, power, and memory footprint, for an arbitrary DNN on a target hardware platform is essential to the design of DNN based models. This ability is critical for the (manual or automatic) design, optimization, and deployment of practical DNNs for a specific hardware deployment platform. Unfortunately, these metrics are slow to evaluate using simulators (where available) and typically require measurement on the target hardware. This work describes PerfSAGE, a novel graph neural network (GNN) that predicts inference latency, energy, and memory footprint on an arbitrary DNN TFlite graph (TFL, 2017). In contrast, previously published performance predictors can only predict latency and are restricted to pre-defined construction rules or search spaces. This paper also describes the EdgeDLPerf dataset of 134,912 DNNs randomly sampled from four task search spaces and annotated with inference performance metrics from three edge hardware platforms. Using this dataset, we train PerfSAGE and provide experimental results that demonstrate state-of-the-art prediction accuracy with a Mean Absolute Percentage Error of <5 These results: (1) Outperform previous state-of-art GNN-based predictors (Dudziak et al., 2020), (2) Accurately predict performance on accelerators (a shortfall of non-GNN-based predictors (Zhang et al., 2021)), and (3) Demonstrate predictions on arbitrary input graphs without modifications to the feature extractor.


Scaling Up Deep Neural Network Optimization for Edge Inference

Deep neural networks (DNNs) have been increasingly deployed on and integ...

AnalogNAS: A Neural Network Design Framework for Accurate Inference with Analog In-Memory Computing

The advancement of Deep Learning (DL) is driven by efficient Deep Neural...

Hardware-Aware Graph Neural Network Automated Design for Edge Computing Platforms

Graph neural networks (GNNs) have emerged as a popular strategy for hand...

Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization

Achieving efficient execution of machine learning models has attracted s...

MaGNAS: A Mapping-Aware Graph Neural Architecture Search Framework for Heterogeneous MPSoC Deployment

Graph Neural Networks (GNNs) are becoming increasingly popular for visio...

Towards Task and Architecture-Independent Generalization Gap Predictors

Can we use deep learning to predict when deep learning works? Our result...

Throughput Maximization of DNN Inference: Batching or Multi-Tenancy?

Deployment of real-time ML services on warehouse-scale infrastructures i...

Please sign up or login with your details

Forgot password? Click here to reset