A Latent Source Model for Nonparametric Time Series Classification

by George H. Chen, et al.

For classifying time series, a nearest-neighbor approach is widely used in practice, with performance often competitive with or better than more elaborate methods such as neural networks, decision trees, and support vector machines. We develop theoretical justification for the effectiveness of nearest-neighbor-like classification of time series. Our guiding hypothesis is that in many applications, such as forecasting which topics will become trends on Twitter, there aren't actually that many prototypical time series to begin with, relative to the number of time series we have access to; e.g., topics become trends on Twitter only in a few distinct manners, whereas we can collect massive amounts of Twitter data. To operationalize this hypothesis, we propose a latent source model for time series, which naturally leads to a "weighted majority voting" classification rule that can be approximated by a nearest-neighbor classifier. We establish nonasymptotic performance guarantees for both weighted majority voting and nearest-neighbor classification under our model, accounting for how much of the time series we observe and for the model complexity. Experimental results on synthetic data show weighted majority voting achieving the same misclassification rate as nearest-neighbor classification while observing less of the time series. We then use weighted majority voting to forecast which news topics on Twitter become trends, where we are able to detect such "trending topics" in advance of Twitter 79% of the time, with a mean early advantage of 1 hour and 26 minutes, a true positive rate of 95%, and a false positive rate of 4%.
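
To make the two classification rules in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `gamma` parameter, the exponential weighting `exp(-gamma * distance)`, and the squared Euclidean distance on the observed prefix are all assumptions chosen for simplicity, and any time-shift alignment is omitted.

```python
import math

def sq_dist(observed, series):
    """Squared Euclidean distance over the observed prefix.

    zip() stops at the shorter sequence, so a partially observed
    series is compared only against the matching initial time steps.
    """
    return sum((x - y) ** 2 for x, y in zip(observed, series))

def weighted_majority_vote(observed, training, gamma=1.0):
    """Each labeled training series votes for its own label.

    training: list of (series, label) pairs with label in {+1, -1}.
    A vote's weight decays exponentially with distance (assumed form).
    """
    score = {+1: 0.0, -1: 0.0}
    for series, label in training:
        score[label] += math.exp(-gamma * sq_dist(observed, series))
    return +1 if score[+1] >= score[-1] else -1

def nearest_neighbor(observed, training):
    """Return the label of the single closest training series."""
    _, label = min(training, key=lambda pair: sq_dist(observed, pair[0]))
    return label

# Hypothetical toy data: one negative example close to the observation,
# three positive examples slightly farther away.
training = [([0.0], -1), ([1.0], +1), ([1.0], +1), ([1.0], +1)]
observed = [0.4]

wmv_label = weighted_majority_vote(observed, training, gamma=1.0)
nn_label = nearest_neighbor(observed, training)
```

On this toy data the two rules disagree: the nearest neighbor is the lone negative example, while the three positive examples together outweigh it in the weighted vote. As `gamma` grows, the closest series dominates the sum, which is one way to see why a nearest-neighbor classifier can approximate weighted majority voting.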




