Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using for loop to get the Hamming distance between 2 strings

In this task i need to get the Hamming distance (the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different - from Wikipedia) between the two strings sequence1 and sequence2.

First i made 2 new strings which is the 2 original strings but both with lowered case to make comparing easier. Then i resorted to using the for loop and if to compare the 2 strings. For any differences in characters in these 2 pair of string, the loop would add 1 to an int x = 0. The returns of the method will be the value of this x.

public static int getHammingDistance(String sequence1, String sequence2) {
    int a = 0;
    String sequenceX = sequence1.toLowerCase();
    String sequenceY = sequence2.toLowerCase();
    for (int x = 0; x < sequenceX.length(); x++) {
        for (int y = 0; y < sequenceY.length(); y++) {
            if (sequenceX.charAt(x) == sequenceY.charAt(y)) {
                a += 0;
            } else if (sequenceX.charAt(x) != sequenceY.charAt(y)) {
                a += 1;
            }
        }
    }
    return a;
}

So does the code looks good and functional enough? Anything i could to fix or to optimize the code? Thanks in advance. I'm a huge noob so pardon me if i asked anything silly

like image 455
Doh Avatar asked Apr 28 '13 07:04

Doh


2 Answers

From my point the following implementation would be ok:

public static int getHammingDistance(String sequence1, String sequence2) {
    char[] s1 = sequence1.toCharArray();
    char[] s2 = sequence2.toCharArray();

    int shorter = Math.min(s1.length, s2.length);
    int longest = Math.max(s1.length, s2.length);

    int result = 0;
    for (int i=0; i<shorter; i++) {
        if (s1[i] != s2[i]) result++;
    }

    result += longest - shorter;

    return result;
}
  1. uses array, what avoids the invocation of two method (charAt) for each single char that needs to be compared;
  2. avoid exception when one string is longer than the other.
like image 129
Francisco Spaeth Avatar answered Oct 20 '22 01:10

Francisco Spaeth


your code is completely off. as you said yourself, the distance is the number of places where the strings differ - so you should only have 1 loop, going over both strings at once. instead you have 2 nested loops that compare every index in string a to every index in string b.

also, writing an if condition that results in a+=0 is a waste of time.

try this instead:

for (int x = 0; x < sequenceX.length(); x++) { //both are of the same length
    if (sequenceX.charAt(x) != sequenceY.charAt(x)) {
        a += 1;
    }
}

also, this is still a naive approach which will probbaly not work with complex unicode characters (where 2 characters can be logically equal yet not have the same character code)

like image 23
radai Avatar answered Oct 20 '22 01:10

radai