KML: Using Machine Learning to Improve Storage Systems

11/22/2021
by Ibrahim Umit Akgun, et al.

Operating systems include many heuristic algorithms designed to improve overall storage performance and throughput. Because such heuristics cannot work well for all conditions and workloads, system designers resorted to exposing numerous tunable parameters to users – thus burdening users with continually optimizing their own storage systems and applications. Storage systems are usually responsible for most latency in I/O-heavy applications, so even a small latency improvement can be significant. Machine learning (ML) techniques promise to learn patterns, generalize from them, and enable optimal solutions that adapt to changing workloads. We propose that ML solutions become a first-class component in OSs and replace manual heuristics to optimize storage systems dynamically. In this paper, we describe our proposed ML architecture, called KML. We developed a prototype KML architecture and applied it to two case studies: optimizing readahead and NFS read-size values. Our experiments show that KML consumes less than 4KB of dynamic kernel memory, has a CPU overhead smaller than 0.2%, and improves I/O throughput by as much as 2.3x and 15x for the two case studies – even for complex, never-seen-before, concurrently running mixed workloads on different storage devices.


research
01/27/2023

A Learned Cache Eviction Framework with Minimal Overhead

Recent work shows the effectiveness of Machine Learning (ML) to reduce c...
research
01/22/2019

Adapting The Secretary Hiring Problem for Optimal Hot-Cold Tier Placement under Top-K Workloads

Top-K queries are an established heuristic in information retrieval. Thi...
research
02/04/2013

RevDedup: A Reverse Deduplication Storage System Optimized for Reads to Latest Backups

Scaling up the backup storage for an ever-increasing volume of virtual m...
research
03/21/2022

LQoCo: Learning to Optimize Cache Capacity Overloading in Storage Systems

Cache plays an important role to maintain high and stable performance (i...
research
08/16/2021

AIRCHITECT: Learning Custom Architecture Design and Mapping Space

Design space exploration is an important but costly step involved in the...
research
04/25/2021

RDMAbox : Optimizing RDMA for Memory Intensive Workloads

We present RDMAbox, a set of low level RDMA opti-mizations that provide ...
research
01/10/2023

Quantitative Verification of Scheduling Heuristics

Computer systems use many scheduling heuristics to allocate resources. U...
