Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees

by Etienne Boursier, et al.

Plato provides sound and tight deterministic error guarantees for approximate analytics over compressed time series. Plato supports expressions composed of the linear algebra operators over vectors that are commonly used in time series analytics, along with arithmetic operators. Such analytics can express common statistics, such as correlation and cross-correlation, that may combine multiple time series. The time series are segmented either by fixed-length segmentation or by the more effective variable-length segmentation. Each segment (i) is compressed by an estimation function that approximates the actual values and comes from a user-chosen estimation function family, and (ii) is associated with one to three precomputed error measures, depending on the case. Plato can then provide tight deterministic error guarantees for the analytics over the compressed time series. This work identifies two broad groups of estimation function families. The Vector Space (VS) family and the Linear Scalable Family (LSF) defined in this work lead to guarantees that are of high quality both theoretically and practically, even for queries that combine multiple time series that have been independently compressed. Well-known function families (e.g., the polynomial function family) belong to LSF. The theoretical aspect of "high quality" is captured by the Amplitude Independence (AI) property: an AI guarantee does not depend on the amplitude of the involved time series, even when multiple time series are combined. Experiments on four real-life datasets validated the importance of AI error guarantees: when the novel AI guarantees were applicable, they ensured that the approximate query results were very close (typically 1
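The general idea of pairing each compressed segment with a precomputed error measure can be illustrated with a minimal sketch. This is not Plato's implementation. It assumes fixed-length segmentation, a constant estimation function (the midrange of each segment), and a single error measure per segment (the maximum absolute deviation). These choices, and the names `Segment`, `compress`, and `approximate_sum`, are hypothetical and chosen only for illustration. The sketch answers a sum query from the compressed form and returns a deterministic bound on the error.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Segment:
    length: int        # number of original data points in the segment
    estimate: float    # constant estimation function (assumed: midrange)
    max_error: float   # precomputed max |actual - estimate| over the segment


def compress(series: Sequence[float], segment_length: int) -> List[Segment]:
    """Fixed-length segmentation with one constant estimate per segment."""
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    segments = []
    for start in range(0, len(series), segment_length):
        chunk = series[start:start + segment_length]
        lo, hi = min(chunk), max(chunk)
        # The midrange minimizes the maximum absolute deviation of a constant.
        segments.append(Segment(len(chunk), (lo + hi) / 2, (hi - lo) / 2))
    return segments


def approximate_sum(segments: Sequence[Segment]) -> Tuple[float, float]:
    """Return (approximate sum, deterministic bound on its absolute error)."""
    approx = sum(s.length * s.estimate for s in segments)
    # Each point deviates by at most max_error, so a segment contributes
    # at most length * max_error to the error of the sum.
    bound = sum(s.length * s.max_error for s in segments)
    return approx, bound


series = [1.0, 2.0, 4.0, 3.0, 10.0, 12.0, 11.0]
approx, bound = approximate_sum(compress(series, segment_length=3))
print(approx, bound, sum(series))
```

The bound here holds for every input, but it is generally loose and grows with the spread of values inside each segment. It is only meant to show the shape of a deterministic guarantee, not the tight guarantees the abstract describes.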


A Shift Test for Independence in Generic Time Series

We describe a family of conservative statistical tests for independence ...

Time series clustering based on the characterisation of segment typologies

Time series clustering is the process of grouping time series with respe...

Efficient Approximate Query Answering over Sensor Data with Deterministic Error Guarantees

With the recent proliferation of sensor data, there is an increasing nee...

SummerTime: Variable-length Time Series Summarization with Applications to Physical Activity Analysis

SummerTime seeks to summarize globally time series signals and provides ...

Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB

To monitor critical infrastructure, high quality sensors sampled at a hi...

Training Robust Deep Models for Time-Series Domain: Novel Algorithms and Theoretical Analysis

Despite the success of deep neural networks (DNNs) for real-world applic...
