Revisiting Active Object Stores: Bringing Data Locality to the Limit With NVM
Object stores are widely used software stacks that achieve excellent scale-out with a well-defined interface and robust performance. However, their traditional get/put interface cannot fully exploit data locality, which keeps them from reaching peak performance. In particular, there is one way to improve data locality that has not yet achieved mainstream adoption: the active object store. Although some projects, such as Swift's Storlets or Ceph Object Classes, have implemented the main idea of the active object store, the scope of these implementations is limited. We believe that there is huge potential for active object stores in the current landscape. Hyper-converged nodes are bringing more computing capabilities to storage nodes, and vice versa. The proliferation of non-volatile memory (NVM) technology is blurring the line between system memory (fast and scarce) and block devices (slow and abundant). More and more applications need to manage very large amounts of data (data analytics, Big Data, machine learning, AI, etc.), demanding bigger clusters and more complex computations. All these elements are potential game changers that need to be evaluated in the scope of active object stores. In this article we propose an active object store software stack and evaluate it on an NVM-populated node. We will show how this setup is able to reduce execution times from 10 application scenarios. Our discussion focuses on the active aspect of the system as well as on the implications of the memory configuration.
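To make the contrast between the get/put interface and an active object store concrete, the following is a minimal sketch and not the software stack proposed in the article. `ToyObjectStore`, its `call` method, and the `items_transferred` counter are hypothetical names introduced only for illustration; the counter is a crude stand-in for the data that leaves the storage node.

```python
from typing import Callable, Dict, List


class ToyObjectStore:
    """In-memory stand-in for a storage node (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._objects: Dict[str, List[int]] = {}
        # Rough proxy for how much data crosses the storage/compute boundary.
        self.items_transferred = 0

    def put(self, key: str, value: List[int]) -> None:
        self._objects[key] = value

    def get(self, key: str) -> List[int]:
        # Passive path: the whole object is shipped to the caller.
        value = self._objects[key]
        self.items_transferred += len(value)
        return value

    def call(self, key: str, method: Callable[[List[int]], int]) -> int:
        # Active path: the method runs next to the data and only
        # the scalar result is shipped back.
        result = method(self._objects[key])
        self.items_transferred += 1
        return result


store = ToyObjectStore()
store.put("samples", list(range(1000)))

# get/put style: fetch everything, then compute on the client.
passive_total = sum(store.get("samples"))
passive_cost = store.items_transferred

# Active style: push the computation to the store.
store.items_transferred = 0
active_total = store.call("samples", sum)
active_cost = store.items_transferred
```

Both paths compute the same sum, but in this toy model the passive path moves every element of the object while the active path moves a single value. A real system would also have to account for the compute and memory resources consumed on the storage node.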