A log-linear time algorithm for constrained changepoint detection
Changepoint detection is a central problem in time series and genomic data. For some applications, it is natural to impose constraints on the directions of changes. One example is ChIP-seq data, for which adding an up-down constraint improves peak detection accuracy, but makes the optimization problem more complicated. We show how a recently proposed functional pruning technique can be adapted to solve such constrained changepoint detection problems. This leads to a new algorithm which can solve problems with arbitrary affine constraints on adjacent segment means, and which has empirical time complexity that is log-linear in the amount of data. This algorithm achieves state-of-the-art accuracy in a benchmark of several genomic data sets, and is orders of magnitude faster than existing algorithms that have similar accuracy. Our implementation is available as the PeakSegPDPA function in the coseg R package, https://github.com/tdhock/coseg
READ FULL TEXT