Invalidation-Based Protocols for Replicated Datastores

by Antonios Katsarakis, et al.

Distributed in-memory datastores underpin cloud applications that run within a datacenter and demand high performance, strong consistency, and availability. A key feature of datastores is data replication. The data are replicated across servers because a single server often cannot handle the request load. Replication is also necessary to guarantee that a server or link failure does not render a portion of the dataset inaccessible. A replication protocol is responsible for ensuring strong consistency between the replicas of a datastore, even when faults occur, by determining the actions necessary to access and manipulate the data. Consequently, a replication protocol also drives the datastore's performance. Existing strongly consistent replication protocols deliver fault tolerance but fall short in terms of performance. The opposite holds in the world of multiprocessors, where data are replicated across the private caches of different cores. Multiprocessors use invalidations to provide strongly consistent replication with high performance but neglect fault tolerance. Although handling failures in the datacenter is critical for data availability, we observe that fault-free operation is the common case and far outweighs operation during faults. In other words, the common operating environment inside a datacenter closely resembles that of a multiprocessor. Based on this insight, we draw inspiration from the multiprocessor to achieve high-performance, strongly consistent replication in the datacenter. The primary contribution of this thesis is in adapting invalidating protocols to the nuances of replicated datastores, which include skewed data accesses, fault tolerance, and distributed transactions.


Hermes: a Fast, Fault-Tolerant and Linearizable Replication Protocol

Today's datacenter applications are underpinned by datastores that are r...

Zeus: Locality-aware Distributed Transactions

State-of-the-art distributed in-memory datastores (FaRM, FaSST, DrTM) pr...

Applying consensus and replication securely with FLAQR

Availability is crucial to the security of distributed systems, but guar...

Resilient Cloud-based Replication with Low Latency

Existing approaches to tolerate Byzantine faults in geo-replicated envir...

Rabia: Simplifying State-Machine Replication Through Randomization

We introduce Rabia, a simple and high performance framework for implemen...

Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications

The replication mechanism resolves some challenges with big data such as...

DXRAM's Fault-Tolerance Mechanisms Meet High Speed I/O Devices

In-memory key-value stores provide consistent low-latency access to all ...
