Revisiting Active Object Stores: Bringing Data Locality to the Limit With NVM

by Alex Barceló, et al.

Object stores are widely used software stacks that achieve excellent scale-out with a well-defined interface and robust performance. However, their traditional get/put interface cannot fully exploit data locality, which keeps them from reaching peak performance. In particular, one way to improve data locality has not yet achieved mainstream adoption: the active object store. Although some projects, such as Swift's Storlets or Ceph Object Classes, have implemented the main idea of the active object store, the scope of these implementations is limited. We believe that active object stores have huge potential in the current landscape. Hyper-converged nodes are bringing more computing capabilities to storage nodes, and vice versa. The proliferation of non-volatile memory (NVM) technology is blurring the line between system memory (fast and scarce) and block devices (slow and abundant). More and more applications (data analytics, Big Data, machine learning, AI, etc.) need to manage vast amounts of data, demanding bigger clusters and more complex computations. All these elements are potential game changers that need to be evaluated in the scope of active object stores. In this article we propose an active object store software stack and evaluate it on an NVM-populated node. We will show how this setup is able to reduce execution times from 10 application scenarios. Our discussion will focus on the active aspect of the system as well as on the implications of the memory configuration.


