
C Programming - Functionality of strlen

Tags: c, string, strlen

I'm working to try and understand some string functions so I can more effectively use them in later coding projects, so I set up the simple program below:

#include <stdio.h>
#include <string.h>

int main (void)
{
// Declare variables:
char test_string[5];
char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};
int init; 
int length = 0;
int match;

// Initialize array:
for (init = 0; init < strlen(test_string); init++)
{    test_string[init] = '\0';
}

// Fill array:
test_string[0] = 'T';
test_string[1] = 'E';
test_string[2] = 'S';
test_string[3] = 'T';

// Get Length:
length = strlen(test_string);

// Get number of characters from string 1 in string 2:
match = strspn(test_string, test_string2);

printf("\nstrlen return = %d", length);
printf("\nstrspn return = %d\n\n", match);

return 0;
}

I expect to see a return of:

strlen return = 4
strspn return = 4

However, I see strlen return = 6 and strspn return = 4. From what I understand, char test_string[5] should allocate 5 bytes of memory and place hex 00 into the fifth byte. The for loop (which should not even be necessary) should then set all the bytes of memory for test_string to hex 00. Then, the immediately following lines should fill test_string bytes 1 through 4 (or test_string[0] through test_string[3]) with what I have specified. Calling strlen at this point should return 4, because it should start at the address of test_string[0] and count up until it hits the first null character, which is at test_string[4]. Yet strlen returns 6. Can anyone explain this? Thanks!

Ryan Barker asked Dec 11 '22 07:12


2 Answers

char test_string[5];

test_string is an array of 5 uninitialized char objects.

for (init = 0; init < strlen(test_string); init++)

Kaboom. strlen scans for the first '\0' null character. Since the contents of test_string are garbage, the behavior is undefined. It might return a small value if one of the bytes happens to be zero, or it might return a large value or crash the program if none of the bytes in test_string is zero and the scan runs past the end of the array.

Even if that weren't the case, evaluating strlen() in the header of a for loop is inefficient. Each strlen() call has to re-scan the entire string (assuming you've given it a valid string), so if your loop worked it would be O(N^2).

If you want test_string to contain just zero bytes, you can initialize it that way:

char test_string[5] = "";

or, since you initialize the first 4 bytes later:

char test_string[5] = "TEST";

or just:

char test_string[] = "TEST";

(The last form lets the compiler figure out that it needs 5 bytes.)

Going back to your declarations:

char test_string2[] = { 'G', 'O', '_', 'T', 'E', 'S', 'T'};

This causes test_string2 to be 7 bytes long, without a trailing '\0' character. That means that passing test_string2 to any function that expects a pointer to a string will cause undefined behavior. You probably want something like:

char test_string2[] = "GO_TEST";
Keith Thompson answered Dec 24 '22 23:12


strlen counts characters until it reaches the first '\0'. Your test_string is uninitialized, so there is no guarantee that it contains one; the scan simply continues until it finds a zero byte, which in your run happened to be 6 bytes from the start of the array.

For a local array declared without an initializer, the compiler does not generate code to initialize it, so you don't have to pay to run that code if you fill the array later.

To initialize it to 0 and skip the loop, you can use

char test_string[5] = {0};

This way, all characters will be initialized to 0 and your strlen will work after you fill the array with "TEST".

Eric Fortin answered Dec 24 '22 21:12