SensiX++: Bringing MLOPs and Multi-tenant Model Serving to Sensory Edge Devices

09/08/2021
by   Chulhong Min, et al.

We present SensiX++, a multi-tenant runtime for adaptive model execution with integrated MLOps on edge devices, e.g., a camera, a microphone, or IoT sensors. SensiX++ operates on two fundamental principles: highly modular componentisation to externalise data operations with clear abstractions, and document-centric manifestation for system-wide orchestration. First, a data coordinator manages the lifecycle of sensors and serves models with correct data through automated transformations. Next, a resource-aware model server executes multiple models in isolation through model abstraction, pipeline automation and feature sharing. An adaptive scheduler then orchestrates the best-effort executions of multiple models across heterogeneous accelerators, balancing latency and throughput. Finally, microservices with REST APIs serve synthesised model predictions, system statistics, and continuous deployment. Collectively, these components enable SensiX++ to serve multiple models efficiently with fine-grained control on edge devices while minimising redundant data operations, managing data and device heterogeneity, reducing resource contention and removing manual MLOps effort. We benchmark SensiX++ with ten different vision and acoustics models across various multi-tenant configurations on different edge accelerators (Jetson AGX and Coral TPU) designed for sensory devices. We report the overall throughput, quantify the benefits of the various automation components of SensiX++, and demonstrate its efficacy in significantly reducing operational complexity and lowering the effort to deploy, upgrade, reconfigure and serve embedded models on edge devices.
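
The abstract gives no implementation details, so what follows is only a minimal illustrative sketch of one idea it mentions: avoiding redundant data operations when several models consume the same sensor stream. Every name here (`TransformSpec`, `Model`, `DataCoordinator`, `serve`) is hypothetical and does not come from SensiX++. The sketch assumes each model declares the input format it expects, and that the coordinator runs each distinct transformation once per frame and shares the result among the models that asked for it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical names throughout: none of these classes come from SensiX++.
@dataclass(frozen=True)
class TransformSpec:
    """Input format a model expects (illustrative fields only)."""
    width: int
    height: int
    grayscale: bool


@dataclass
class Model:
    name: str
    spec: TransformSpec
    predict: Callable[[list], str]


class DataCoordinator:
    """Runs each distinct transformation once per frame and shares the result."""

    def __init__(self) -> None:
        self.models: List[Model] = []
        self.transform_calls = 0  # how many transformations actually ran

    def register(self, model: Model) -> None:
        self.models.append(model)

    def _transform(self, frame: list, spec: TransformSpec) -> list:
        self.transform_calls += 1
        # Stand-in for real resizing or colour conversion.
        return [spec.width, spec.height, spec.grayscale, len(frame)]

    def serve(self, frame: list) -> Dict[str, str]:
        cache: Dict[TransformSpec, list] = {}  # per-frame cache keyed by spec
        results: Dict[str, str] = {}
        for model in self.models:
            if model.spec not in cache:
                cache[model.spec] = self._transform(frame, model.spec)
            results[model.name] = model.predict(cache[model.spec])
        return results


coordinator = DataCoordinator()
shared = TransformSpec(224, 224, False)
coordinator.register(Model("detector", shared, lambda x: "detector-ok"))
coordinator.register(Model("classifier", shared, lambda x: "classifier-ok"))
coordinator.register(Model("ocr", TransformSpec(128, 32, True), lambda x: "ocr-ok"))
results = coordinator.serve([0] * 10)
```

In this toy setup three models are served from one frame with only two transformations, because the detector and the classifier declare the same input format. A real runtime would also have to handle sensor lifecycle, model isolation and scheduling across accelerators, none of which this sketch attempts.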


