Near-Optimal Average-Case Approximate Trace Reconstruction from Few Traces
In the standard trace reconstruction problem, the goal is to exactly reconstruct an unknown source string x ∈ {0,1}^n from independent "traces", which are copies of x that have been corrupted by a δ-deletion channel that independently deletes each bit of x with probability δ and concatenates the surviving bits. We study the approximate trace reconstruction problem, in which the goal is only to obtain a high-accuracy approximation of x rather than an exact reconstruction. We give an efficient algorithm, and a near-matching lower bound, for approximate reconstruction of a random source string x ∈ {0,1}^n from few traces. Our main algorithmic result is a polynomial-time algorithm with the following property: for any deletion rate 0 < δ < 1 (which may depend on n), for almost every source string x ∈ {0,1}^n, given any number M ≤ Θ(1/δ) of traces from Del_δ(x), the algorithm constructs a hypothesis string x̂ that has edit distance at most n · (δM)^{Ω(M)} from x. We also prove a near-matching information-theoretic lower bound showing that given M ≤ Θ(1/δ) traces from Del_δ(x) for a random n-bit string x, the smallest possible expected edit distance that any algorithm can achieve, regardless of its running time, is n · (δM)^{O(M)}.
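To make the setup concrete, the following minimal Python sketch simulates the δ-deletion channel and measures edit distance for a random source string. It is an illustration of the problem definition only, not the reconstruction algorithm of the paper; the function names `deletion_channel` and `edit_distance`, and the parameter values n = 200, δ = 0.05, M = 3, are assumptions chosen for the example.

```python
import random

def deletion_channel(x, delta, rng):
    """Return one trace of x: each bit is deleted independently with
    probability delta, and the surviving bits are concatenated in order."""
    return [b for b in x if rng.random() >= delta]

def edit_distance(a, b):
    """Levenshtein distance (insertions, deletions, substitutions),
    computed with the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]

# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
n, delta, M = 200, 0.05, 3

x = [rng.randint(0, 1) for _ in range(n)]                      # random source string
traces = [deletion_channel(x, delta, rng) for _ in range(M)]   # M independent traces

# Naive baseline: use a single trace as the hypothesis string.
for t in traces:
    print(len(t), edit_distance(x, t))
```

Because a trace is a subsequence of x, its edit distance from x equals the number of deleted bits, which is δn in expectation. That is the baseline a single trace gives; the results above concern how much closer than this one can get with M traces, namely edit distance n · (δM)^{Ω(M)}.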