Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Longest common subsequence (LCS) brute force algorithm

I want to create a brute force algorithm to find the largest common subsequence between 2 strings, but I'm struggling to enumerate all possibilities in the form of a algorithm.

I don't want a dynamic programming answer as oddly enough I manage to figure this one out (You would think the brute force method would be easier). Please use pseudo code, as I prefer to understand it and write it up myself.

like image 582
Yahya Uddin Avatar asked May 19 '26 02:05

Yahya Uddin


2 Answers

It's pretty much the same as DP minus the memoization part.

LCS(s1, s2, i, j):
    if(i == -1 || j == -1)
        return 0
    if(s1[i] == s2[j])
        return 1 + LCS(s1, s2, i-1, j-1)
    return max(LCS(s1, s2, i-1, j), LCS(s1, s2, i, j-1))

The idea is if we have two strings s1 and s2 where s1 ends at i and s2 ends at j, then the LCS is:

  • if either string is empty, then the longest common subsequence is 0.
  • If the last character (index i) of string 1 is the same as the last one in string 2 (index j), then the answer is 1 plus the LCS of s1 and s2 ending at i-1 and j-1, respectively. Because it's obvious that those two indices contribute to the LCS, so it's optimal to count them.
  • If the last characters don't match, then we try to remove one of the characters. So we try finding LCS between s1 (ending at i-1) and s2 (ending at j) and the LCS between s1 (ending at i) and s2 (ending at j-1), then take the max of both.

The time complexity is obviously exponential.

like image 176
turingcomplete Avatar answered May 21 '26 20:05

turingcomplete


I like @turingcomplete's answer but it's not really brute-force since it doesn't actually enumerate all candidate solutions. For example, if the strings are "ABCDE" and "XBCDY", the recursive approach won't test for "ABC" versus "XBC" because the test for "A" versus "X" will have already failed. It's kind of a matter of opinion whether you want to count that as a unique candidate though. In fact, you could argue that "ABC" versus "ABCDY" is a valid candidate as well, and just immediately fails due to length difference. You could add separate LA and LB to the code below to fully enumerate those candidates though.

for L = min(A.length, B.length) to 1
{
    for iA = 0 to A.length - L - 1
    {
        for iB = 0 to B.length - L - 1
        {
            for i = 0 to L - 1
            {
                if(A[iA] != B[iB])
                {
                    match failed;
                }
            }
            if match didn't fail, then
            A[iA..iA+L] and B[iB..iB+L] are a maximal common substring
        }
     }
}
no common substring
like image 45
MooseBoys Avatar answered May 21 '26 21:05

MooseBoys



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!