Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"Find substring in char[]" getting unexpected results

Disclaimer: This is a bit of a homework question. I'm attempting to write a contains(java.lang.String subString) method , that returns an int value representing the index of the comparison string within the primary string, for a custom-made String class.

Some of the rules:

  • No collection classes
  • Only charAt() and toCharArray() are allowed from the java String class (but methods from other classes are allowed)
  • Assume length() returns the length of the primary string (which is exactly what it does)

My Code:

public int contains(java.lang.String subString) {
    this.subString = subString;
    char[] arrSubStr = this.subString.toCharArray();
    //Create initial fail
    int index = -1;
    //Make sure comparison subString is the same length or shorter than the primary string
    if(arrSubStr.length > length()) {
        return index;
    }
    //Steps to perform if initial conditions are met
    else {
        //Compare first character of subString to each character in primary string
        for(int i = 0; i < length(); i++) {
            //When a match is found...
            if(arrSubStr[0] == this.content[i]) {
                //...make sure that the subString is not longer than the remaining length of the primary string
                if(arrSubStr.length > length() - i) {
                    return index;
                }
                //Proceed matching remainder of subString
                else {
                    //Record the index of the beginning of the subString contained in primary string
                    index = i;
                    //Starting with second character of subString...
                    for(int j = 1; j < arrSubStr.length;) {
                        //...compare with subsequent chars of primary string, 
                        //and if a failure of match is found, reset index to failure (-1)
                        if(arrSubStr[j] != this.content[j+i]) {
                            index = -1;
                            return index;
                        }
                        //If we get here, it means whole subString match found
                        //Return the index (=i) we set earlier
                        else {
                            return index;
                        }
                    }
                }
            }
        }
    }
return index;
}

Results from testing:

Primary string: asdfg
Comparison string: donkey
Result: -1 [PASS]

Primary string: asdfg
Comparison string: asdfg
Result: 0 [PASS]

Primary string: asdfg
Comparison string: g
Result: 4 [PASS]

Primary string: asasasf
Comparison string: asd
Result: 0 [FAIL] (should be -1)

Primary string: asasasf
Comparison string: asf
Result: 0 [FAIL] (should be 4)

The comments reflect how the code is intended to work. However its clear that when it reaches the second for loop, the logic is breaking down somehow to give the results above. But I can't see the problem. Could I get a second set of eyes on this?

like image 333
Soundscape Avatar asked Feb 15 '19 04:02

Soundscape


2 Answers

//If we get here, it means whole subString match found
//Return the index (=i) we set earlier
else {
    return index;
}

This assumption is not correct unfortunately. If you get there, it means that the second character of both substrings are identical since the if-else statement will only get executed once and both ends contains a return.

The way to solve this is probably easy now that I've diagnosed the problem but I want to go a bit further with this. The way we try to write code on a daily basis is a way in which the code we use can be maintainable, reusable and testable.

This means basically that the function we have here could be easily sliced up in different little functions invoked one after the other for which we could write unit tests and receive a quick feedback on whether a set of logical statements fit or not.

like image 173
Yassin Hajaj Avatar answered Nov 04 '22 11:11

Yassin Hajaj


With suggestions from Jai and azurefrog in the comments, I was able to solve the issues by re-writing the logic to the following (somewhat abridged):

    if(arrSubStr.length > length()) {
        return index;
    }
    //Steps to perform if initial conditions are met
    else {
        //Compare first character of subString to each character in primary string
        for(int i = 0; i < length(); i++) {
            //When a match is found...
            if(arrSubStr[0] == this.content[i]) {
                //...make sure that the subString is not longer than the remaining length of the primary string
                if(arrSubStr.length <= length() - i) {
                    //Record the index of the beginning of the subString contained in primary string
                    index = i;
                    //Starting with second character of subString...
                    for(int j = 1; j < arrSubStr.length; j++) {
                        //...compare with subsequent chars of primary string, 
                        //and if a failure of match is found, reset index to failure (-1)
                        if(arrSubStr[j] != this.content[j+i]) {
                            index = -1;
                            break;
                        }
                    }
                }
            }
        }
    }
return index;

Essentially, I removed all of the return statements from within the loops. Simply setting the index value appropriately and making use of the final (outside) return statement was, in hindsight, the correct way to approach the problem. I then also added a break; to the inner for loop to make sure that a failure to match would continue the loop ticking through. I'm sure there's still unnecessary code in there, but while its still passing the requisite tests, I'm encouraged to leave it the hell alone. :)
I'm still a novice at Java, so I hope this explanation made sense.

like image 38
Soundscape Avatar answered Nov 04 '22 11:11

Soundscape