ML-EXray: Visibility into ML Deployment on the Edge

by Hang Qiu, et al.

Benefiting from expanding cloud infrastructure, deep neural networks (DNNs) today have increasingly high performance when trained in the cloud. Researchers spend months of effort competing for an extra few percentage points of model accuracy. However, when these models are actually deployed on edge devices in practice, very often, the performance can abruptly drop over 10% without obvious reasons. The key challenge is that there is not much visibility into ML inference execution on edge devices, and very little awareness of potential issues during the edge deployment process. We present ML-EXray, an end-to-end framework that provides visibility into layer-level details of the ML execution and helps developers analyze and debug cloud-to-edge deployment issues. More often than not, the reason for suboptimal edge performance lies not only in the model itself, but in every operation throughout the data flow and the deployment process. Evaluations show that ML-EXray can effectively catch deployment issues, such as pre-processing bugs, quantization issues, and suboptimal kernels. Using ML-EXray, users need to write fewer than 15 lines of code to fully examine the edge deployment pipeline. Eradicating these issues, ML-EXray can correct model performance by up to 30%, pinpoint error-prone layers, and guide users to optimize kernel execution latency by two orders of magnitude. Code and APIs will be released as an open-source multi-lingual instrumentation library and a Python deployment validation library.




