Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding longest common subsequence in O(NlogN) time

Is there any way of finding the longest common subsequence of two sequences in O(NlogN) time?

I read somewhere that there is a way to achieve this using binary search.

I know the dp approach that takes O(N2) time.

like image 661
LTim Avatar asked Jun 10 '15 22:06

LTim


People also ask

What is the time complexity of the longest common subsequence?

The general algorithms which are followed to solve the Longest Common Subsequence (LCS) problems have both time complexity and space complexity of O(m * n).

How do you identify the LCS longest common subsequence?

The value in the last row and the last column is the length of the longest common subsequence. In order to find the longest common subsequence, start from the last element and follow the direction of the arrow.

How do you find the longest subsequence?

To find the LIS for a given array, we need to return max(L(i)) where 0 < i < n. Formally, the length of the longest increasing subsequence ending at index i, will be 1 greater than the maximum of lengths of all longest increasing subsequences ending at indices before i, where arr[j] < arr[i] (j < i).

What is the time complexity of the longest common subsequence problem where the length of one string is m and the length of the other string is n?

Since we are using two for loops for both the strings ,therefore the time complexity of finding the longest common subsequence using dynamic programming approach is O(n * m) where n and m are the lengths of the strings.


3 Answers

For the general case, the O(N^2) dynamic programming algorithm is the best you can do. However, there exist better algorithms in some special cases.

  1. Alphabet size is bounded

This is a very common situation. Sequences consisting of letters from some alphabet (e.g. English) lie in this category. For this case, the O(N*M) algorithm can be optimised to get an O(N^2/logN) with method of four Russians. I don't know how exactly, you can search for it.

  1. Both sequences consist of distinct elements

An example problem is "Given two permutations of numbers from 1 to N, find their LCS". This one can be solved in O(N*logN).
Let the sequences be A and B.
Define a sequence C. C[i] is the index of B[i] in A. (A[C[i]] = B[i])
LCS of A and B is the longest increasing subsequence of C.

like image 54
sparik Avatar answered Oct 08 '22 23:10

sparik


The dynamic programming approach, which is O(n2) for general case. For certain other cases, there are lower-complexity algorithms:

  • For a fixed alphabet size (which doesn't grow with n), there's the Method of Four Russians which brings the time down to O(n2/log n) (see here).

  • See here another further optimized case.

like image 22
Ami Tavory Avatar answered Oct 09 '22 00:10

Ami Tavory


Assuming Exponential Time Hypothesis (which is stricter than P is not equal to NP, but is still widely believed to be true), it is not possible to do it in time O(N^{2 - eps}) for any positive constant eps, see "Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping" by Karl Bringmann and Marvin Kunnemann (pre-print on arXiv is available).

Roughly speaking, it means that the general case of this problem cannot be solved in time better than something like O(N^2/log N), so if you want faster algorithms you have to consider additional constraints (some properties of the strings) or look for approximate solution.

like image 37
Andrew Ryzhikov Avatar answered Oct 09 '22 01:10

Andrew Ryzhikov