How does strlen() work internally? Are there any inherent bugs in the function?
The strlen() function calculates the length of a given string: it takes a string as an argument and returns its length. The returned value is of type size_t (an unsigned integer type). It is declared in the <string.h> header.
The strlen() built-in function determines the length of the string pointed to by its argument, excluding the terminating null character.
The syntax of the strlen() function in C is as follows: size_t strlen(const char *str); In the above syntax, str refers to the string whose length has to be calculated.
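As a small usage sketch: because the result is a size_t, it is best stored in a size_t variable and printed with the %zu conversion. The helper name describe_length is hypothetical and exists only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (illustration only): writes a short description of
   text and its length into out, which holds out_size bytes.
   Returns whatever snprintf returns. */
int describe_length(const char *text, char *out, size_t out_size) {
    size_t len = strlen(text);   /* the terminating '\0' is not counted */
    return snprintf(out, out_size, "\"%s\" has %zu characters", text, len);
}
```

Calling describe_length("hello", buf, sizeof buf) with a sufficiently large buf should leave the text "hello" has 5 characters in buf.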
strlen usually works by counting the characters in a string until a \0 character is found. A canonical implementation would be:
size_t strlen(const char *str) {
    size_t len = 0;
    /* Advance until the terminating '\0'; the terminator is not counted. */
    while (*str != '\0') {
        str++;
        len++;
    }
    return len;
}
As for possible inherent bugs in the function, there are none - it works exactly as documented. That's not to say it doesn't have certain problems, to wit:
- If you pass it a sequence of characters with no \0 at the end, you may run into problems but technically, that's not a C string (a) and it's your own fault.
- You can't have \0 characters within your string but, again, it wouldn't be a C string in that case.
But none of these are bugs, they're just consequences of a design decision.
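Both problems can be sketched in a few lines. The helper bounded_len below is hypothetical (it is not an ISO C function); it assumes the caller knows the size of the buffer and uses the standard memchr to stop scanning at that size, so a missing terminator cannot send it past the end.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of ISO C): returns the number of
   characters before the first '\0' in buf, or cap if no '\0' occurs
   within the first cap bytes. */
size_t bounded_len(const char *buf, size_t cap) {
    const char *end = memchr(buf, '\0', cap);
    return end != NULL ? (size_t)(end - buf) : cap;
}
```

For an array declared as const char embedded[] = "abc\0def"; strlen(embedded) reports 3 even though the array holds 8 bytes, because counting stops at the first \0. For an array such as char raw[4] = {'a', 'b', 'c', 'd'}; calling strlen(raw) is undefined behavior, whereas bounded_len(raw, sizeof raw) simply returns 4.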
On that last bullet point, see also this excellent article by Joel Spolsky where he discusses various string formats and their characteristics, including normal C strings (with a terminator), Pascal strings (with a length) and the combination of the two, null terminated Pascal strings.
Though he has a more, shall we say, "colorful" term for that final type, one which frequently comes to mind whenever I think of Python's excellent (and totally unrelated) f-strings :-)
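To make the combined format concrete, here is a minimal sketch of a string type that stores its length up front and still keeps a terminator. The type lstring and its functions are hypothetical names invented for this illustration, not taken from the article or from any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type (illustration only): the length is stored next to
   the data, and a trailing '\0' is kept so data can be passed to C APIs. */
typedef struct {
    size_t len;
    char *data;
} lstring;

/* Copies len bytes from src into a new lstring.
   Returns 0 on success, -1 on allocation failure or size overflow. */
int lstring_init(lstring *s, const char *src, size_t len) {
    if (len == SIZE_MAX) return -1;      /* len + 1 would overflow */
    s->data = malloc(len + 1);
    if (s->data == NULL) return -1;
    memcpy(s->data, src, len);
    s->data[len] = '\0';
    s->len = len;
    return 0;
}

/* Constant-time length: no scan for a terminator is needed. */
size_t lstring_length(const lstring *s) {
    return s->len;
}

void lstring_free(lstring *s) {
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

With this layout the length lookup does not depend on the contents, so embedded \0 characters are representable: initializing from the five bytes of "ab\0cd" gives lstring_length of 5, while strlen on the same data still stops at 2.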
(a) A C string is defined as a series of non-terminator characters (any character other than \0) followed by a terminator. Hence this definition disallows both embedded terminators within the sequence, and sequences without such a terminator. Or, putting it more succinctly (as per the ISO C standard):
A string is a contiguous sequence of characters terminated by and including the first null character.