Duplication with transposition distance to the root for q-ary strings

01/17/2020
by Nikita Polyanskii, et al.

We study the duplication with transposition distance between strings of length n over a q-ary alphabet and their roots. In other words, we investigate the number of duplication operations of the form x = (abcd) → y = (abcbd), where x and y are strings and a, b, c and d are their substrings, needed to get a q-ary string of length n starting from the set of strings without duplications. For exact duplication, we prove that the maximal distance between a string of length at most n and its root has the asymptotic order n/log n. For approximate duplication, where a β-fraction of symbols may be duplicated incorrectly, we show that the maximal distance has a sharp transition from the order n/log n to log n at β=(q-1)/q. The motivation for this problem comes from genomics, where such duplications represent a special kind of mutation and the distance between a given biological sequence and its root is the smallest number of transposition mutations required to generate the sequence.
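As a concrete illustration of the operation x = (abcd) → y = (abcbd), here is a minimal Python sketch of a single exact duplication with transposition. The function name and the index parametrization (i, j, k) are assumptions made for illustration and do not come from the paper. Allowing c to be empty, which reduces the operation to a tandem duplication, is likewise an assumption.

```python
def duplicate_with_transposition(x: str, i: int, j: int, k: int) -> str:
    """Apply one exact duplication with transposition to x.

    Hypothetical parametrization (not from the paper):
      a = x[:i], b = x[i:j], c = x[j:k], d = x[k:]
    The result is y = a + b + c + b + d, i.e. a copy of b is
    inserted immediately after c.
    """
    # b must be non-empty; c may be empty (assumed here), giving a tandem duplication.
    if not (0 <= i < j <= k <= len(x)):
        raise ValueError("need 0 <= i < j <= k <= len(x)")
    a, b, c, d = x[:i], x[i:j], x[j:k], x[k:]
    return a + b + c + b + d


# Quaternary (q = 4) example over the DNA alphabet:
# a = "A", b = "C", c = "G", d = "T"
y = duplicate_with_transposition("ACGT", 1, 2, 3)
print(y)  # ACGCT
```

In these terms, the distance studied in the abstract is the smallest number of such steps needed to reach a given string from a string that contains no duplications. The sketch only models one forward step and does not compute that distance.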
