
Is it legal to call memchr with a too-long length, if you know the character will be found before reaching the end of the valid region?

Is the following defined behavior in C11 and C++11 [1]?

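#include <stdbool.h>  /* bool (needed in C) */
#include <string.h>   /* memchr */

bool has4() {
    char buf[10] = {0, 1, 2, 4};
    /* buf holds 10 bytes, but the length argument is 20 */
    return memchr(buf, 4, 20);
}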

Here we pass a too-long length to memchr: the array has 10 elements but we pass 20. The element we are searching for, however, is always found before the end of the array. It is not clear to me whether this is legal.

If this is allowed, it would limit implementation flexibility, since the implementation cannot rely on the size being a valid indication of the size of the accessible memory region and hence must be careful about reading beyond the found element. An example would be an implementation that wants to do a 16 byte SIMD load starting at the passed-in pointer and then check all the 16 bytes in parallel. If the user passes a length of 16, this would be safe only if the entire length was required to be accessible.
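
To make the concern concrete, here is a minimal sketch of a block-wise search. It is not taken from any real library: memchr_block16 is a hypothetical name, and a memcpy into a 16-byte local array stands in for the SIMD load. The point is that each chunk is read in full before it is checked, so bytes after the match are touched as well.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical block-wise search: reads whole 16-byte chunks, so it is only
 * safe if all n bytes starting at s are readable. */
void *memchr_block16(const void *s, int c, size_t n) {
    const unsigned char *p = (const unsigned char *)s;
    unsigned char target = (unsigned char)c;
    size_t i = 0;

    /* Full chunks: every byte of the chunk is read, even past a match. */
    for (; i + 16 <= n; i += 16) {
        unsigned char chunk[16];
        memcpy(chunk, p + i, 16);  /* stand-in for a 16-byte SIMD load */
        for (size_t j = 0; j < 16; ++j) {
            if (chunk[j] == target) {
                return (void *)(p + i + j);
            }
        }
    }

    /* Tail: fewer than 16 bytes remain, so check them one at a time. */
    for (; i < n; ++i) {
        if (p[i] == target) {
            return (void *)(p + i);
        }
    }
    return NULL;
}
```

Called as memchr_block16(buf, 4, 20) on the 10-element buf from has4, the first chunk would read 16 bytes, 6 of them past the end of the array, even though the match is at index 3.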

Otherwise (if the above code is legal) the implementation must avoid potentially faulting on elements past the target element, for example by aligning the load (potentially expensive) or checking if the pointer is near the end of a protection boundary.
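
The protection-boundary check could look roughly like the sketch below. The helper name load_stays_in_page and the 4096-byte page size are assumptions for illustration. Converting a pointer to an integer like this is implementation-defined, so it is the kind of thing a library implementation might do internally rather than portable user code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size; real code would query the platform instead. */
#define ASSUMED_PAGE_SIZE 4096u

/* Returns true if reading `width` bytes starting at `p` stays within the
 * page that contains `p`, so the read cannot touch a neighboring page. */
bool load_stays_in_page(const void *p, size_t width) {
    uintptr_t offset = (uintptr_t)p % ASSUMED_PAGE_SIZE;
    return width <= ASSUMED_PAGE_SIZE - offset;
}
```

An implementation could take the wide-load path only when this check passes for a 16-byte width and fall back to a byte-by-byte loop otherwise.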


[1] Here's one of those rare questions where I guess tagging both C and C++ is valid: as far as I can tell the C++ standard just defers directly to the C standard here, via reference, in terms of behavior, but if that's not the case I want to know.

BeeOnRope asked Nov 15 '17 19:11



1 Answer

In C11 and C++17 (the emphasis, on the final sentence, is mine):

void *memchr(const void *s, int c, size_t n);

The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found.
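
The "as if" behavior in that last sentence can be pictured as the loop below. This is only an illustrative sketch with a hypothetical name, memchr_sequential. A conforming library is free to use any faster strategy as long as the observable behavior matches.

```c
#include <stddef.h>

/* Hypothetical reference version: reads one byte at a time and returns
 * as soon as a match is found, so no byte past the match is touched. */
void *memchr_sequential(const void *s, int c, size_t n) {
    const unsigned char *p = (const unsigned char *)s;
    unsigned char target = (unsigned char)c;
    for (size_t i = 0; i < n; ++i) {
        if (p[i] == target) {
            return (void *)(p + i);
        }
    }
    return NULL;
}
```

With the buf from the question, memchr_sequential(buf, 4, 20) returns at index 3 and never reads buf[4] or anything beyond it, which is why the oversized length is harmless under this model.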

So long as memchr finds what it's looking for before you step out of bounds, you're okay.


C++11 and C++14 both refer to C99 (ISO/IEC 9899:1999), which doesn't have such wording.

C99 wording:

void *memchr(const void *s, int c, size_t n);

The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s.

Because C99 does not define what happens when you pass too large a size, the behavior is undefined in C99.
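
If the code also has to be well-defined under the C99 wording, one option is to pass a length that never exceeds the array. A minimal sketch follows, where has4_portable is a hypothetical variant of the function from the question.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical variant of has4 from the question: the length is taken from
 * the array itself, so memchr is never asked to look past the object. */
bool has4_portable(void) {
    char buf[10] = {0, 1, 2, 4};
    return memchr(buf, 4, sizeof buf) != NULL;
}
```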

AndyG answered Nov 16 '22 04:11
