Codes Correcting a Single Long Duplication Error
We consider the problem of constructing a code capable of correcting a single long tandem duplication error of variable length. As the main contribution of this paper, we present a q-ary efficiently encodable code of length n+1 and redundancy 1 that can correct a single duplication of length at least K = 4·⌈log_q n⌉ + 1. The encoding complexity is O(n^2/log n) and the decoding complexity is O(n). We also present a q-ary non-efficient code of length n+1 that corrects a single long duplication of length at least K = ⌈log_q n⌉ + ϕ(n), where ϕ(n)→∞ as n→∞. This code has redundancy less than 1 for sufficiently large n. Moreover, we show that, in the class of codes correcting a single long duplication with redundancy 1, the value K in our constructions is order-optimal.
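To make the error model concrete, the following is a minimal Python sketch, not the construction from the paper. It applies a single tandem duplication, computes the threshold K = 4·⌈log_q n⌉ + 1 stated above, and lists every word from which a received word could have arisen by one tandem duplication. The function names (`ceil_log`, `min_dup_length`, `tandem_duplicate`, `candidate_originals`) are hypothetical and chosen only for illustration.

```python
def ceil_log(n: int, q: int) -> int:
    """Smallest m with q**m >= n, i.e. ceil(log_q n), using integer arithmetic."""
    m, power = 0, 1
    while power < n:
        power *= q
        m += 1
    return m


def min_dup_length(n: int, q: int) -> int:
    """Threshold K = 4*ceil(log_q n) + 1 from the abstract."""
    return 4 * ceil_log(n, q) + 1


def tandem_duplicate(x: str, i: int, k: int) -> str:
    """Insert a copy of x[i:i+k] immediately after that substring."""
    if k <= 0 or i < 0 or i + k > len(x):
        raise ValueError("duplicated window must lie inside x")
    return x[:i + k] + x[i:i + k] + x[i + k:]


def candidate_originals(y: str, n: int) -> set[str]:
    """All length-n words that yield y after one tandem duplication.

    The duplication length k is known from the length difference.
    Brute force for illustration only; this is not the O(n) decoder
    mentioned in the abstract.
    """
    k = len(y) - n
    if k <= 0:
        return set()
    candidates = set()
    for i in range(len(y) - 2 * k + 1):
        if y[i:i + k] == y[i + k:i + 2 * k]:
            # Remove one of the two adjacent copies.
            candidates.add(y[:i + k] + y[i + 2 * k:])
    return candidates


if __name__ == "__main__":
    print(min_dup_length(1024, 2))
    print(tandem_duplicate("ACGTAC", 1, 3))
    print(candidate_originals("AABB", 3))
```

When `candidate_originals` returns more than one word, as for the received word `AABB` with n = 3, the channel output alone does not determine the transmitted word. A duplication-correcting code must therefore contain at most one word from each such candidate set.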