Near-Data Processing for Differentiable Machine Learning Models

by Hyeokjun Choe, et al.
Seoul National University

Near-data processing (NDP) refers to augmenting memory or storage with processing power. Despite its potential for accelerating computing and reducing power requirements, only limited progress has been made in popularizing NDP for various reasons. Recently, two major changes have occurred that have ignited renewed interest and caused a resurgence of NDP. The first is the success of machine learning (ML), which often demands a great deal of computation for training, requiring frequent transfers of big data. The second is the popularity of NAND flash-based solid-state drives (SSDs) containing multicore processors that can accommodate extra computation for data processing. In this paper, we evaluate the potential of NDP for ML using a new SSD platform that allows us to simulate in-storage processing (ISP) of ML workloads. Our platform (named ISP-ML) is a full-fledged simulator of a realistic multi-channel SSD that can execute various ML algorithms using data stored in the SSD. To conduct a thorough performance analysis and an in-depth comparison with alternative techniques, we focus on a specific algorithm: stochastic gradient descent (SGD), which is the de facto standard for training differentiable models such as logistic regression and neural networks. We implement and compare three SGD variants (synchronous, Downpour, and elastic averaging) using ISP-ML, exploiting the multiple NAND channels to parallelize SGD. In addition, we compare the performance of ISP with that of conventional in-host processing, revealing the advantages of ISP. Based on the advantages and limitations identified through our experiments, we further discuss directions for future research on ISP for accelerating ML.
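To make the idea of parallelizing SGD across NAND channels concrete, the following is a minimal sketch of the synchronous variant applied to logistic regression. It is not the ISP-ML implementation, which the abstract does not detail. It assumes that each channel can be modeled as an in-memory list of (features, label) pairs, that every channel computes a mini-batch gradient on its own data, and that the gradients are averaged before each shared weight update. All function names, hyperparameters, and the toy dataset are hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def shard_gradient(w, batch):
    # Mean logistic-loss gradient over one channel's mini-batch.
    g = [0.0] * len(w)
    for x, y in batch:
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            g[j] += err * xj
    return [gj / len(batch) for gj in g]

def synchronous_sgd(channels, dim, lr=0.5, steps=200, batch=8, seed=0):
    # Assumption: each "channel" is a list of samples standing in for the
    # pages reachable through one NAND channel.
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(steps):
        # Every channel computes a gradient from the same weights ...
        grads = [
            shard_gradient(w, rng.sample(data, min(batch, len(data))))
            for data in channels
        ]
        # ... and the update waits for all of them (the synchronous barrier).
        avg = [sum(g[j] for g in grads) / len(grads) for j in range(dim)]
        w = [wj - lr * aj for wj, aj in zip(w, avg)]
    return w

def make_toy_channels(num_channels=4, per_channel=64, seed=1):
    # Hypothetical linearly separable data: label is 1 when x1 + x2 > 0.
    rng = random.Random(seed)
    channels = []
    for _ in range(num_channels):
        data = []
        for _ in range(per_channel):
            x1, x2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
            label = 1.0 if x1 + x2 > 0 else 0.0
            data.append(([x1, x2, 1.0], label))  # trailing 1.0 is the bias term
        channels.append(data)
    return channels

def accuracy(w, channels):
    # Fraction of samples across all channels classified correctly.
    samples = [s for data in channels for s in data]
    hits = 0
    for x, y in samples:
        pred = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0.0
        hits += pred == y
    return hits / len(samples)

if __name__ == "__main__":
    toy = make_toy_channels()
    weights = synchronous_sgd(toy, dim=3)
    print("weights:", weights)
    print("accuracy:", accuracy(weights, toy))
```

In this sketch the list comprehension runs the channels one after another; in an SSD the per-channel gradient computations would presumably proceed concurrently. The Downpour and elastic averaging variants named in the abstract relax the barrier in different ways and are not shown here.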




