Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Longest Common Subsequence for Multiple Sequences

I have done a bunch of research for finding the longest for M = 2 sequences, but I am trying to figure out how to do it for M ≥ 2 sequences

I am being given N and M: M sequences, with N unique elements. N is the set of {1 - N}. I have thought about the dynamic programming approach, but I am still confused as to how to actually incorporate it.

Example input

5 3
5 3 4 1 2
2 5 4 3 1
5 2 3 1 4

The max sequence here can be seen to be

5 3 1

Expected output

Length = 3
like image 515
mkobit Avatar asked Apr 22 '11 04:04

mkobit


People also ask

Which algorithm is used for longest common subsequence?

In this example, we have two strings X = BACDB and Y = BDCB to find the longest common subsequence. Following the algorithm LCS-Length-Table-Formulation (as stated above), we have calculated table C (shown on the left hand side) and table B (shown on the right hand side).

Which of the following is the longest common subsequence?

Explanation: The longest common subsequence is “PRTPQRS” and its length is 7.

Is longest common subsequence NP hard?

Complexity. For the general case of an arbitrary number of input sequences, the problem is NP-hard.

How do you find the longest common subsequence in two strings in C++?

Let the input sequences be X[0.. m-1] and Y[0..n-1] of lengths m and n respectively. And let L(X[0.. m-1], Y[0..n-1]) be the length of LCS of the two sequences X and Y.


1 Answers

A simple idea.

For each number i between 1 and N, calculate the longest subsequence where the last number is i. (Let's call it a[i])

To do that, we'll iterate over numbers i in the first sequence from start to end. If a[i] > 1, then there's number j such that in each sequence it comes before i.
Now we can just check all possible values of j and (if previous condition holds) do a[i] = max(a[i], a[j] + 1).

As the last bit, because j comes before i in first sequence, it means a[j] is already calculated.

for each i in first_sequence
    // for the OP's example, 'i' would take values [5, 3, 4, 1, 2], in this order
    a[i] = 1;
    for each j in 1..N
        if j is before i in each sequence
            a[i] = max(a[i], a[j] + 1)
        end
    end
end

It's O(N^2*M), if you calculate matrix of positions beforehand.

like image 64
Nikita Rybak Avatar answered Nov 08 '22 11:11

Nikita Rybak