Dominant Dataset Selection Algorithms for Time-Series Data Based on Linear Transformation

by Yi Wu, et al.

With the explosive growth of time-series data, its scale already exceeds conventional computation and storage capabilities in many applications. At the same time, the information carried by time-series data is highly redundant because of the strong correlation between series. In this paper, we propose new dominant dataset selection algorithms that extract a small dataset representing the kernel information carried by the time-series data with an error rate less than ϵ, where ϵ can be arbitrarily small. We prove that the dominant dataset selection problem is NP-complete. An affine transformation model is introduced to define the linear transformation function, which ensures that the selection function of the dominant dataset has constant time complexity O(1). Furthermore, a scanning selection algorithm with time complexity O(n²) and a greedy selection algorithm with time complexity O(n³) are proposed to extract the dominant dataset based on the linear correlation between time-series data. The proposed algorithms are evaluated on real electric power consumption data from a city in China. The experimental results show that the proposed algorithms not only reduce the size of the kernel dataset but also preserve the integrity of the time-series data in terms of accuracy and efficiency.
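The idea of a scanning selection over linearly correlated series can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the error metric or the selection rule, so the sketch assumes a least-squares affine fit y ≈ a·x + b and a relative residual norm as the error rate. The names `affine_fit`, `relative_error`, and `scan_select` are hypothetical.

```python
from typing import List, Sequence, Tuple


def affine_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y ≈ a * x + b; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0.0:
        # A constant x can only represent y through the offset b.
        return 0.0, mean_y
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = cov_xy / var_x
    return a, mean_y - a * mean_x


def relative_error(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed error rate: ||y - (a*x + b)|| / ||y|| for the best affine fit."""
    a, b = affine_fit(x, y)
    residual = sum((yi - (a * xi + b)) ** 2 for xi, yi in zip(x, y)) ** 0.5
    norm_y = sum(yi ** 2 for yi in y) ** 0.5
    if norm_y == 0.0:
        return 0.0 if residual == 0.0 else float("inf")
    return residual / norm_y


def scan_select(series: Sequence[Sequence[float]], eps: float) -> List[int]:
    """Single scan: keep a series only if no already-kept series
    represents it within eps. At most n*(n-1)/2 pairwise fits."""
    dominant: List[int] = []
    for i, y in enumerate(series):
        if not any(relative_error(series[j], y) < eps for j in dominant):
            dominant.append(i)
    return dominant


data = [
    [1.0, 2.0, 3.0, 4.0],  # kept as the first dominant series
    [2.0, 4.0, 6.0, 8.0],  # 2 * series 0, so it is represented
    [3.0, 5.0, 7.0, 9.0],  # 2 * series 0 + 1, so it is represented
    [1.0, 0.0, 1.0, 0.0],  # not an affine image of series 0, so it is kept
]
print(scan_select(data, eps=0.01))
```

In this sketch each dropped series can be rebuilt from one kept series and its two coefficients (a, b), which is one way to read the constant-time selection function mentioned in the abstract. The number of pairwise fits grows quadratically with the number of series.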


Generic Approach to Visualization of Time Series Data

Time series is a collection of data instances that are ordered according...

Pattern Sampling for Shapelet-based Time Series Classification

Subsequence-based time series classification algorithms provide accurate...

A Feature Selection Method for Multi-Dimension Time-Series Data

Time-series data in application areas such as motion capture and activit...

Grouped self-attention mechanism for a memory-efficient Transformer

Time-series data analysis is important because numerous real-world tasks...

An Ensemble method for Content Selection for Data-to-text Systems

We present a novel approach for automatic report generation from time-se...

MinMaxLTTB: Leveraging MinMax-Preselection to Scale LTTB

Visualization plays an important role in analyzing and exploring time se...

Learning Time Series from Scale Information

Sequentially obtained dataset usually exhibits different behavior at dif...
