In C, the memmem function is used to locate a particular sequence of bytes in a memory area. It can be compared to strstr, which is dedicated to null-terminated strings.
Is there any particular reason for this function to be available only as a GNU extension, and not directly in the standard library? The manual states:
This function was broken in Linux libraries up to and including libc 5.0.9; there the needle and haystack arguments were interchanged, and a pointer to the end of the first occurrence of needle was returned.
Both old and new libc's have the bug that if needle is empty, haystack-1 (instead of haystack) is returned. And glibc 2.0 makes it worse, returning a pointer to the last byte of haystack. This is fixed in glibc 2.1.
I can see it went through several fixes, yet I'd like to know why it was not made as directly available as strstr (if not more so) on some distributions. Does it still raise implementation issues?
Edit (motivations): I wouldn't ask this question if the standard had decided it the other way around: including memmem but not strstr. Indeed, strstr could be something like:
memmem(str, strlen(str), "search", 6);
Slightly trickier, but still a pretty logical one-liner considering that it is very usual in C functions to require both the data chunk and its length.
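To make the idea concrete, here is a minimal sketch of strstr written on top of a length-based search. Since memmem is not in the C standard, the sketch uses a hypothetical stand-in, my_memmem, implemented as a naive byte-by-byte scan; my_strstr is likewise an illustrative name, not a library function. Both treat an empty needle as matching at the start of the haystack, which is what strstr does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical portable stand-in for memmem: a naive byte search.
 * Returns a pointer to the first occurrence of needle in haystack,
 * or NULL if there is none. */
static void *my_memmem(const void *haystack, size_t hlen,
                       const void *needle, size_t nlen)
{
    const unsigned char *h = haystack;

    if (nlen == 0)
        return (void *)h;   /* empty needle matches at the start */
    if (nlen > hlen)
        return NULL;        /* needle cannot fit in haystack */

    for (size_t i = 0; i <= hlen - nlen; i++) {
        if (memcmp(h + i, needle, nlen) == 0)
            return (void *)(h + i);
    }
    return NULL;
}

/* strstr expressed in terms of the length-based search
 * (illustrative name, not a standard function). */
static char *my_strstr(const char *haystack, const char *needle)
{
    return my_memmem(haystack, strlen(haystack), needle, strlen(needle));
}
```

Unlike strstr, the length-based version can also search data containing embedded zero bytes. With glibc, defining _GNU_SOURCE before including <string.h> should let the real memmem take the place of my_memmem.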
Edit (2): another motivation from comments and answers. Quoting Theolodis:
Not every function is necessary to every single, or at least most of the C developers, so it would actually make the standard libraries unnecessarily huge.
Well, I couldn't agree more, I'm always in favor of making the libraries lighter and faster. But then... why both strncpy and memcpy (from keltar's comment)...? I could almost ask: why has poor memmem been "black-sheeped"?
Historically, that is, before the first revision of the Standard, C was shaped by compiler writers.
The case of strstr is a little different because it was introduced by the C Committee itself. The C89 Rationale document tells us that:
"The strstr function is an invention of the Committee. It is included as a hook for efficient algorithms, or for built-in substring instruction."
The C Committee does not explain why it did not provide a more general function that is not limited to strings, so any reasoning can only be speculation. My only guess is that the use case was not considered important enough to justify a generic memmem instead of strstr. Remember that among the goals of C there is this requirement (in the C99 Rationale): "Keep the language small and simple". Also, even POSIX didn't consider it for inclusion.
In any case, to my knowledge nobody has submitted a Defect Report or a proposal to have memmem included.