CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning

by Sai Qian Zhang, et al.

The emergence of the Internet of Things (IoT) has resulted in a remarkable amount of data generated on edge devices, which is often processed using AI algorithms. On-device learning enables edge platforms to continually adapt AI models to users' personal data, which in turn improves service quality. However, AI training on resource-limited devices is extremely difficult because of the intensive computing workload and the significant on-chip memory consumption of deep neural networks (DNNs). To mitigate this, we propose to use embedded dynamic random-access memory (eDRAM) as the main storage medium for training data. Compared with static random-access memory (SRAM), eDRAM offers more than 2× higher storage density, which reduces off-chip memory traffic. However, to keep the stored data intact, eDRAM must perform power-hungry data refresh operations. eDRAM refresh can be eliminated if the data is stored for a period shorter than the eDRAM retention time. To achieve this, we design a novel reversible DNN architecture that significantly reduces data lifetime during training and removes the need for eDRAM refresh. We further design an efficient on-device training engine, termed CAMEL, that uses eDRAM as the main on-chip memory. CAMEL enables the intermediate results during training to fit fully in on-chip eDRAM arrays and completely eliminates off-chip DRAM traffic during the training process. We evaluate our CAMEL system on multiple DNNs with different datasets, demonstrating a more than 3× saving in total DNN training energy consumption compared with the baselines, while achieving similar or even better validation accuracy.
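To illustrate the two ideas in the abstract, the following is a minimal sketch, not the paper's actual architecture. It assumes a generic additive-coupling reversible block (in the style of RevNet), in which a layer's inputs can be recomputed from its outputs so they need not be kept in memory until the backward pass. The functions `f`, `forward`, `inverse`, and `refresh_needed`, the scale constants, and the timing values in microseconds are all hypothetical and chosen only for illustration.

```python
import math

def f(values, scale):
    # Hypothetical sub-function; any deterministic function would work here.
    return [math.tanh(scale * v) for v in values]

def forward(x1, x2):
    # Additive coupling: each output half adds a function of the other half.
    y1 = [a + b for a, b in zip(x1, f(x2, 0.5))]
    y2 = [a + b for a, b in zip(x2, f(y1, 2.0))]
    return y1, y2

def inverse(y1, y2):
    # Recover the inputs from the outputs alone, undoing the steps in reverse.
    x2 = [a - b for a, b in zip(y2, f(y1, 2.0))]
    x1 = [a - b for a, b in zip(y1, f(x2, 0.5))]
    return x1, x2

def refresh_needed(data_lifetime_us, retention_time_us):
    # Refresh can be skipped only if data lives for less than the retention time.
    return data_lifetime_us >= retention_time_us

x1 = [0.1, -0.7, 1.3]
x2 = [0.9, 0.2, -0.4]
y1, y2 = forward(x1, x2)
r1, r2 = inverse(y1, y2)

# Largest reconstruction error across both halves (floating-point rounding only).
max_error = max(abs(a - b) for a, b in zip(r1 + r2, x1 + x2))
```

Because the inputs are reconstructed from the outputs, an activation only has to survive for a short window rather than from the forward pass to the matching backward step. Under the stated assumption, this is the property that lets data lifetime fall below the retention time, so that `refresh_needed` returns `False`.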




