Exploring the space-time pattern of log-transformed infectious count of COVID-19: a clustering-segmented autoregressive sigmoid model
By the end of April 20, 2020, only a few new COVID-19 cases remained in China, whereas the rest of the world was still seeing increases in the number of new cases. An efficient statistical model of COVID-19 spread is therefore of great importance, as it could help in the global fight against the virus. We propose a clustering-segmented autoregressive sigmoid (CSAS) model to explore the space-time pattern of the log-transformed infectious count. The CSAS model incorporates four key characteristics: unknown clusters, change points, stretched S-curves, and autoregressive terms. These let us understand how the outbreak spreads in time and in space, understand how the spread is affected by epidemic control strategies, and apply the model to updated data from an extended period of time. We propose a nonparametric graph-based clustering method for discovering dissimilarity of the curve time series in space. The method is justified with theoretical support that demonstrates how the model works under mild and easily verified conditions. We also propose a very strict purity score that penalizes overestimation of the number of clusters. Simulations show that our nonparametric graph-based clustering method is faster and more accurate than the parametric clustering method, regardless of the size of the data sets. We provide a Bayesian information criterion (BIC) to identify multiple change points and to calculate a confidence interval for a mean response. By applying the CSAS model to the collected data, we can explain the differences between prevention and control policies in China and in selected countries.
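To make the idea of selecting change points with a BIC more concrete, the following is a minimal sketch and not the CSAS procedure itself. It assumes a much simpler setting: a single synthetic log-count series, at most one change point, and ordinary least-squares lines in place of stretched S-curves with autoregressive terms. The function names (`segment_rss`, `bic`, `best_single_change_point`), the parameter counts, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def segment_rss(t, y):
    """Residual sum of squares of a least-squares line fitted to (t, y)."""
    X = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def bic(rss, n, n_params):
    """Gaussian BIC up to an additive constant; the floor avoids log(0)."""
    return n * np.log(max(rss, 1e-12) / n) + n_params * np.log(n)

def best_single_change_point(t, y, min_len=5):
    """Compare 'no change point' with every admissible single split by BIC.

    Returns (best_bic, k), where k is the index of the first observation
    of the second segment, or None if no split improves the BIC.
    """
    n = len(y)
    best = (bic(segment_rss(t, y), n, 2), None)  # one line: 2 parameters
    for k in range(min_len, n - min_len + 1):
        rss = segment_rss(t[:k], y[:k]) + segment_rss(t[k:], y[k:])
        # Assumed count: two lines (4 parameters) plus the split location.
        score = bic(rss, n, 5)
        if score < best[0]:
            best = (score, k)
    return best

# Synthetic log-transformed counts (illustrative only): fast growth that
# slows sharply at day 30, e.g. after a hypothetical control measure.
rng = np.random.default_rng(0)
t = np.arange(60, dtype=float)
true_cp = 30
log_count = np.where(t < true_cp, 0.20 * t,
                     0.20 * true_cp + 0.02 * (t - true_cp))
log_count = log_count + rng.normal(scale=0.05, size=t.size)

score, cp = best_single_change_point(t, log_count)
print("estimated change point index:", cp)
```

Extending this sketch to multiple change points would mean searching over sets of split locations and penalizing each additional segment in the same way; how the paper defines its penalty and segment model is not specified in this abstract.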