
How to write a better strlen function?

I am reading "Write Great Code Volume 2" and it shows the following strlen implementation:

int myStrlen( char *s )
{
    char *start;
    start = s;
    while( *s != 0 )
    {
        ++s;
    }
    return s - start;
}

The book says that this implementation is typical of an inexperienced C programmer. I have been coding in C for the past 11 years and I can't see how to write this function any better in C (I can imagine writing something better in assembly). How is it possible to write better code than this in C? I looked at the standard library implementation of strlen in glibc and couldn't understand most of it. Where can I find better information on how to write highly optimized code?

asked Jul 05 '11 14:07 by Victor




3 Answers

From Optimising strlen(), a blogpost by Colm MacCarthaigh:

Unfortunately in C, we’re doomed to an O(n) implementation, best case, but we’re still not done … we can do something about the very size of n.

It gives a good example of the direction you can work in to speed it up. Another quote from it:

Sometimes going really really fast just makes you really really insane.

answered Oct 08 '22 04:10 by Mojo Risin


Victor, take a look at this:
http://en.wikipedia.org/wiki/Strlen#Implementation

P.S. The reason you don't understand the glibc version is probably that it reads a whole word at a time and uses bit manipulation (subtraction and masking against constants) to find the \0.
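
The kind of bit trick meant here can be sketched in a few lines. This is a minimal sketch assuming 32-bit words, not the glibc source; the function name has_zero_byte is made up for illustration. It shows the widely used test for "does this word contain a zero byte?" that word-at-a-time string functions are built around.

```c
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Returns nonzero exactly when at least one of the four bytes of v is 0x00.
 *   v - 0x01010101  : a byte that was 0x00 wraps around and gets its top bit set
 *   & ~v            : ignore bytes whose top bit was already set in v
 *   & 0x80808080    : keep only the top bit of each byte */
static int has_zero_byte(uint32_t v)
{
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}
```

For example, has_zero_byte(0x11220033u) is nonzero because the second-lowest byte is zero, while has_zero_byte(0x11223344u) is 0. One test per word replaces four separate byte comparisons.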

answered Oct 08 '22 03:10 by gkrogers


For starters, this counts bytes, not characters, so it is worthless as a character count for encodings like UTF-8. Calculating the number of characters in a UTF-8 string is more complicated, whereas the number of bytes is, of course, just as easy to calculate as in, say, an ASCII string.
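
To make the byte/character difference concrete, here is a minimal sketch that counts code points instead of bytes. It assumes the input is valid, NUL-terminated UTF-8; the name utf8_codepoints is made up for illustration. It relies on the fact that every UTF-8 continuation byte has the bit pattern 10xxxxxx, so counting all the other bytes counts code points.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Counts code points in a NUL-terminated string, assuming valid UTF-8.
 * Continuation bytes (10xxxxxx) are skipped; every other byte starts
 * a new code point. */
size_t utf8_codepoints(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the string "h\xC3\xA9llo" ("héllo" encoded as UTF-8), strlen reports 6 bytes while utf8_codepoints reports 5. Note that a code point is still not always what a user perceives as one character (combining marks, for instance), so this is only a first approximation.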

In general, you can optimize on some platforms by reading into larger registers. Since the other links posted so far don't have an example of that, here's a bit of pseudo-pseudocode for little endian:

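size_t size = 0;
unsigned int x;
/* assumes yourstring is aligned for int access and that reading a few bytes past the \0 is harmless */
const unsigned int *caststring = (const unsigned int *) yourstring;
for (;;) {
  x = *caststring++;
  if (!(x & 0xff)) /* first byte in this int-sized package is 0 */ return size;
  else if (!(x & 0xff00)) /* second byte */ return size+1;
  else if (!(x & 0xff0000)) /* third byte */ return size+2;
  else if (!(x & 0xff000000)) /* fourth byte, assuming a 32-bit int */ return size+3;
  size += sizeof (int);
}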
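
For reference, the same idea can be written as a complete function. This is a sketch under stated assumptions, not the glibc implementation, and the name word_strlen is made up for illustration. It aligns the pointer first, then scans 32-bit words using a single zero-byte test per word, and finds the exact terminator byte by byte, so it does not depend on endianness. The main caveat: the word loop may read up to three bytes past the terminator. Those bytes lie in the same aligned word, which is harmless on mainstream platforms, but it is outside what the C standard guarantees and memory checkers may flag it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of word-at-a-time scanning. */
size_t word_strlen(const char *s)
{
    const char *p = s;

    /* Step byte by byte until p sits on a 4-byte boundary. */
    while (((uintptr_t)p & (sizeof(uint32_t) - 1)) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        ++p;
    }

    /* Scan one aligned 32-bit word at a time.
     * ASSUMPTION: reading the rest of the aligned word that holds the
     * terminator is safe on the target platform. */
    for (;;) {
        uint32_t w;
        memcpy(&w, p, sizeof w);  /* sidesteps aliasing rules; typically compiled to one load */
        if ((w - 0x01010101u) & ~w & 0x80808080u)
            break;                /* some byte in this word is 0 */
        p += sizeof w;
    }

    /* Locate the exact terminator inside the final word. */
    while (*p != '\0')
        ++p;
    return (size_t)(p - s);
}
```

Calling word_strlen("hello, world") gives the same result as strlen, namely 12. Whether it is actually faster than the library strlen has to be measured; library versions are usually tuned with wider registers and platform-specific instructions.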

answered Oct 08 '22 04:10 by Jan Krüger