Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compare strings in java and remove the part of string where they are identical

Tags:

java

algorithm

I have two strings with me:

s1="MICROSOFT"
s2="APPLESOFT"

I need to compare the strings and remove the duplicate part (always towards the end) from the second string. So I should get "MICROSOFT" and "APPLE" as output.

I have compared both the strings character by character.

               String s1 = "MICROSOFT";
               String s2 = "APPLESOFT";

               for(int j=0; j<s1.length(); j++)
               {
                   char c1 = s1.charAt(j);
                   char c2 = s2.charAt(j);

                   if(c1==c2)
                       System.out.println("Match found!!!");
                   else
                       System.out.println("No match found!");
               }

It should check the strings and if the two strings have same characters until the end of string, then I need to remove that redundant part, SOFT in this case, from the second string. But I can't think of how to proceed from here.

There can be more duplicates...but we have to remove only those which are continuously identical. if i have APPWWSOFT and APPLESOFT, i should get APPLE again in the second string since we got LE different than WW in between

Can you guys please help me out here?

like image 969
Snow Leopard Avatar asked Sep 27 '12 11:09

Snow Leopard


1 Answers

Search and read about Longest Common Subsequence, you can find efficient algorithms to find out the LCS of two input strings. After finding the LCS of the input strings, it is easy to manipulate the inputs. For example, in your case an LCS algorithm will find "SOFT" as the LCS of these two strings, then you might check whether the LCS is in the final part of the 2nd input and then remove it easily. I hope this idea helps.

An example LCS code in Java is here, try it: http://introcs.cs.princeton.edu/java/96optimization/LCS.java.html

Example scenario (pseudocode):

input1: "MISROSOFT";
input2: "APPLESOFT";

execute LCS(input1, input2);
store the result in lcs, now lcs = "SOFT";

iterate over the characters of input2,
if a character exists in lcs then remove it from input2.
like image 109
Juvanis Avatar answered Nov 03 '22 02:11

Juvanis