
How does the strlen function work internally?

Tags:

c

strlen

How does strlen() work internally? Are there any inherent bugs in the function?

Manu asked Nov 09 '10 10:11




1 Answer

strlen usually works by counting the characters in a string until a \0 character is found. A canonical implementation would be:

size_t strlen (const char *str) {
    size_t len = 0;
    while (*str != '\0') {
        str++;
        len++;
    }
    return len;
}
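One practical wrinkle if you want to compile and try the loop above: the name strlen is reserved for the standard library, so a user-written version needs a different name. The following is a minimal sketch under that assumption. The name my_strlen is hypothetical, and it uses pointer subtraction instead of a separate counter, which is a common variation rather than how any particular library does it:

```c
#include <stddef.h>

/* my_strlen is a hypothetical name chosen for illustration;
   "strlen" itself belongs to the standard library. */
size_t my_strlen(const char *str) {
    const char *p = str;

    /* Walk forward until the terminating '\0' is reached. */
    while (*p != '\0') {
        p++;
    }

    /* The distance travelled is the length, terminator excluded. */
    return (size_t)(p - str);
}
```

Calling my_strlen("hello") yields 5 and my_strlen("") yields 0, matching what the library function reports for the same inputs.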

As for possible inherent bugs in the function, there are none - it works exactly as documented. That's not to say it doesn't have certain problems, to wit:

  • if you pass it a "string" that doesn't have a \0 at the end, strlen keeps reading past the end of the array, which is undefined behavior, but technically, that's not a C string (a) and it's your own fault.
  • you can't put \0 characters within your string but, again, it wouldn't be a C string in that case.
  • it's not the most efficient way - strlen has to walk the whole string, so it takes time proportional to the length, whereas if you stored a length up front you could get it in constant time.

But none of these are bugs, they're just consequences of a design decision.

On that last bullet point, see also this excellent article by Joel Spolsky where he discusses various string formats and their characteristics, including normal C strings (with a terminator), Pascal strings (with a length) and the combination of the two, null terminated Pascal strings.

Though he has a more, shall we say, "colorful" term for that final type, one which frequently comes to mind whenever I think of Python's excellent (and totally unrelated) f-strings :-)
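To make the "store a length up front" idea from the bullet list concrete, here is a rough sketch of a string that carries its length alongside the still-terminated character data (the combined form mentioned above). The type lp_string and the functions lp_from_cstr and lp_length are hypothetical names invented for this example, not part of any standard library:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-carrying string, for illustration only. */
struct lp_string {
    size_t len;        /* number of characters before the terminator */
    const char *data;  /* still '\0'-terminated, so C APIs can use it */
};

/* Pay the linear strlen cost once, when the value is created. */
struct lp_string lp_from_cstr(const char *s) {
    struct lp_string result;
    result.len = strlen(s);
    result.data = s;
    return result;
}

/* Every later length query is a single field read. */
size_t lp_length(const struct lp_string *s) {
    return s->len;
}
```

The trade-off is the usual one: extra storage per string and a length that must be kept in sync whenever the data changes, in exchange for constant-time length lookups.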


(a) A C string is defined as a series of non-terminator characters (any character other than \0) followed by a terminator. Hence this definition disallows both embedded terminators within the sequence, and sequences without such a terminator. Or, putting it more succinctly (as per the ISO C standard):

A string is a contiguous sequence of characters terminated by and including the first null character.
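Both consequences of that definition can be seen in a few lines. The sketch below assumes you know the size of the buffer you are looking at; bounded_length is a hypothetical helper built on the standard memchr function, which stops at the buffer limit instead of running off the end the way strlen would on an unterminated array:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string stored in buf, or cap
   if no '\0' occurs within the first cap bytes. */
size_t bounded_length(const char *buf, size_t cap) {
    const char *end = memchr(buf, '\0', cap);
    return end ? (size_t)(end - buf) : cap;
}
```

For an array initialized with "ab\0cd", sizeof reports 6 bytes but strlen reports 2, because the string ends at the first null character. For a four-byte array holding 'a', 'b', 'c', 'd' with no terminator, bounded_length(raw, 4) returns 4, whereas calling strlen on it would be undefined behavior.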

paxdiablo answered Sep 21 '22 06:09