# The Longest Subsequence-Repeated Subsequence Problem

Motivated by computing duplication patterns in sequences, a new fundamental problem called the longest subsequence-repeated subsequence (LSRS) is proposed. Given a sequence S of length n, a letter-repeated subsequence is a subsequence of S in the form of x_1^d_1x_2^d_2⋯ x_k^d_k with x_i a subsequence of S, x_j≠ x_j+1 and d_i≥ 2 for all i in [k] and j in [k-1]. We first present an O(n^6) time algorithm to compute the longest cubic subsequences of all the O(n^2) substrings of S, improving the trivial O(n^7) bound. Then, an O(n^6) time algorithm for computing the longest subsequence-repeated subsequence (LSRS) of S is obtained. Finally we focus on two variants of this problem. We first consider the constrained version when Σ is unbounded, each letter appears in S at most d times and all the letters in Σ must appear in the solution. We show that the problem is NP-hard for d=4, via a reduction from a special version of SAT (which is obtained from 3-COLORING). We then show that when each letter appears in S at most d=3 times, then the problem is solvable in O(n^5) time.

READ FULL TEXT