
Is there a particular reason for memmem being a GNU extension?

Tags:

c

memory

gnu

In C, the memmem function locates a particular sequence of bytes in a memory area. It is comparable to strstr, which is dedicated to null-terminated strings.
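To make the difference concrete, here is a minimal sketch, assuming a glibc-style system where memmem is exposed by defining _GNU_SOURCE. The buffer and the helper names (data, offset_with_memmem, offset_with_strstr) are made up for illustration. The buffer contains an embedded NUL byte, which strstr treats as the end of the string while memmem does not.

```c
#define _GNU_SOURCE /* assumption: glibc-style feature macro; must precede the includes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative buffer: 5 bytes of data with a NUL in the middle
 * (the literal also carries a trailing NUL, which we do not count). */
static const char data[] = "ab\0cd";
#define DATA_LEN (sizeof data - 1)

/* Offset of the bytes "cd" found with memmem, or -1 if absent. */
long offset_with_memmem(void)
{
    const char *hit = memmem(data, DATA_LEN, "cd", 2);
    return hit ? (long)(hit - data) : -1;
}

/* Same search with strstr, which stops scanning at the first NUL byte. */
long offset_with_strstr(void)
{
    const char *hit = strstr(data, "cd");
    return hit ? (long)(hit - data) : -1;
}

/* Small self-check that a caller can run from main(). */
void check_offsets(void)
{
    assert(offset_with_memmem() == 3);  /* found past the embedded NUL */
    assert(offset_with_strstr() == -1); /* never looks past the NUL */
}
```

The explicit length argument is what lets memmem work on arbitrary binary data rather than only on C strings.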

Is there any particular reason for this function to be available as a GNU extension, and not directly in the standard library? The manual states:

This function was broken in Linux libraries up to and including libc 5.0.9; there the needle and haystack arguments were interchanged, and a pointer to the end of the first occurrence of needle was returned.

Both old and new libc's have the bug that if needle is empty, haystack-1 (instead of haystack) is returned. And glibc 2.0 makes it worse, returning a pointer to the last byte of haystack. This is fixed in glibc 2.1.
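The edge cases listed in the manual (argument order, which end of the match is returned, the empty needle) can be illustrated with a naive portable sketch. This is not glibc's implementation; naive_memmem is a hypothetical name, and the quadratic scan is chosen for readability rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, naive O(n*m) search; same argument order as memmem:
 * haystack first, then needle, each followed by its length. */
void *naive_memmem(const void *haystack, size_t haystack_len,
                   const void *needle, size_t needle_len)
{
    const unsigned char *h = haystack;

    if (needle_len == 0)
        return (void *)haystack;    /* empty needle matches at the start */
    if (needle_len > haystack_len)
        return NULL;                /* needle cannot fit in the haystack */

    for (size_t i = 0; i + needle_len <= haystack_len; i++) {
        if (memcmp(h + i, needle, needle_len) == 0)
            return (void *)(h + i); /* start, not end, of the first match */
    }
    return NULL;
}

/* Small self-check that a caller can run from main(). */
void check_naive_memmem(void)
{
    const char hay[] = "abcabc";
    assert(naive_memmem(hay, 6, "", 0) == hay);
    assert(naive_memmem(hay, 6, "bc", 2) == hay + 1);
}
```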

I can see it went through several fixes, yet I'd like to know why it was not made as directly available as strstr (if not more so) on some distributions. Does it still bring up implementation issues?

Edit (motivations): I wouldn't ask this question if the standard had decided it the other way around: including memmem but not strstr. Indeed, strstr could be something like:

memmem(str, strlen(str), "search", 6);

Slightly trickier, but still a pretty logical one-liner considering that it is very usual in C functions to require both the data chunk and its length.
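Generalising that one-liner, a strstr-like function could be sketched on top of memmem as follows. This assumes a glibc-style system where _GNU_SOURCE exposes memmem; strstr_via_memmem is a hypothetical name, not an existing library function.

```c
#define _GNU_SOURCE /* assumption: glibc-style feature macro; must precede the includes */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: substring search expressed through memmem.
 * Both lengths are recomputed with strlen on every call. */
char *strstr_via_memmem(const char *haystack, const char *needle)
{
    return memmem(haystack, strlen(haystack), needle, strlen(needle));
}

/* Small self-check that a caller can run from main(). */
void check_strstr_via_memmem(void)
{
    const char *s = "a needle in a haystack";
    assert(strstr_via_memmem(s, "needle") == strstr(s, "needle"));
    assert(strstr_via_memmem(s, "missing") == NULL);
}
```

One trade-off of this approach is the extra strlen pass over both strings before the search starts, which a dedicated strstr can avoid.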

Edit (2): another motivation from comments and answers. Quoting Theolodis:

Not every function is necessary to every single, or at least most of the C developers, so it would actually make the standard libraries unnecessarily huge.

Well, I couldn't agree more, I'm all in when it comes to making the libraries lighter and faster. But then... why both strncpy and memcpy (from keltar's comment)...? I could almost ask: why has poor memmem been "black-sheeped"?

John WH Smith asked Jun 17 '14 12:06


1 Answer

Historically, that is, before the first version of the Standard, C was defined by compiler writers.

The case of strstr is a little different because it was introduced by the C Committee. The C89 Rationale document tells us that:

"The strstr function is an invention of the Committee. It is included as a hook for efficient algorithms, or for built-in substring instruction."

The C Committee does not explain why it did not provide a more general function not limited to strings, so any reasoning can only be speculation. My only guess is that the use case was not considered important enough to justify a generic memmem instead of strstr. Remember that the goals of C include this requirement (in the C99 Rationale): "Keep the language small and simple". Also, even POSIX didn't consider it for inclusion.

In any case, to my knowledge nobody has submitted a Defect Report or a proposal to have memmem included.

ouah answered Oct 31 '22 15:10