Cloudprofiler: TSC-based inter-node profiling and high-throughput data ingestion for cloud streaming workloads

by Shinhyung Yang, et al.

To conduct real-time analytics computations, big data stream processing engines are required to process unbounded data streams at millions of events per second. However, current streaming engines exhibit low throughput and high tuple processing latency. Performance engineering is complicated by the fact that streaming engines constitute complex distributed systems consisting of multiple nodes in the cloud. A profiling technique is required that is capable of measuring time durations at high accuracy across nodes. Standard clock synchronization techniques such as the network time protocol (NTP) are limited to millisecond accuracy, and hence cannot be used. We propose a profiling technique that relates the time-stamp counters (TSCs) of nodes to measure the duration of events in a streaming framework. The precision of the TSC relation determines the accuracy of the measured duration. The TSC relation is conducted in quiescent periods of the network to achieve accuracy in the tens of microseconds. We propose a throughput-controlled data generator to reliably determine the sustainable throughput of a streaming engine. To facilitate high-throughput data ingestion, we propose a concurrent object factory that moves the deserialization overhead of incoming data tuples off the critical path of the streaming framework. The evaluation of the proposed techniques within the Apache Storm streaming framework on the Google Compute Engine public cloud shows that data ingestion increases from 700 k to 4.68 M tuples per second, and that time durations can be profiled at a measurement accuracy of 92 μs, which is three orders of magnitude higher than the accuracy of NTP, and one order of magnitude higher than prior work.
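As an illustration of what relating the counters of two nodes can look like, the following minimal sketch estimates the offset between a local and a remote counter from round-trip probes. It is not the procedure from the paper: it uses a generic Cristian-style midpoint estimate and approximates a quiescent network by keeping only the probe with the shortest round trip. The names `Probe`, `relate_counters`, and `to_local` are hypothetical, and the sketch assumes both counters tick at the same rate.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Probe:
    """One request/reply exchange; all fields are raw counter readings (ticks)."""
    send_local: int   # local counter when the request was sent
    remote: int       # remote counter when the request was handled
    recv_local: int   # local counter when the reply arrived


def relate_counters(probes: Iterable[Probe]) -> Tuple[float, float]:
    """Return (offset, error_bound) in ticks.

    Assumption: the probe with the smallest round trip saw the least
    network interference, so it gives the tightest estimate.
    """
    best = min(probes, key=lambda p: p.recv_local - p.send_local)
    round_trip = best.recv_local - best.send_local
    # Assume the remote reading was taken halfway through the round trip.
    offset = best.remote - (best.send_local + best.recv_local) / 2
    # The true offset lies within half a round trip of the estimate.
    return offset, round_trip / 2


def to_local(remote_ticks: int, offset: float) -> float:
    """Translate a remote counter reading onto the local counter's timeline."""
    return remote_ticks - offset
```

With an offset in hand, a duration that starts on one node and ends on another can be computed by translating the remote reading with `to_local` and subtracting. The uncertainty of that duration is bounded by the returned error bound, which is why the precision of the relation limits the accuracy of the measurement.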




StreamBox-TZ: A Secure IoT Analytics Engine at the Edge

We present StreamBox-TZ, a stream analytics engine for an edge platform....

TiLT: A Time-Centric Approach for Stream Query Optimization and Parallelization

Stream processing engines (SPEs) are widely used for large scale streami...

Colocating Real-time Storage and Processing: An Analysis of Pull-based versus Push-based Streaming

Real-time Big Data architectures evolved into specialized layers for han...

Stream Processing With Dependency-Guided Synchronization

Real-time data processing applications with low latency requirements hav...

A New Frontier for Pull-Based Graph Processing

The trade-off between pull-based and push-based graph processing engines...

Distributed Real-Time Data Stream Analysis for CTA

Once completed, the Cherenkov Telescope Array (CTA) will be able to map ...

AIR – A Light-Weight Yet High-Performance Dataflow Engine based on Asynchronous Iterative Routing

Distributed Stream Processing Systems (DSPSs) are among the currently mo...
