Is the following defined behavior in C11 and C++11 [1]?
#include <stdbool.h>
#include <string.h>

bool has4() {
    char buf[10] = {0, 1, 2, 4};
    return memchr(buf, 4, 20);  /* length 20 exceeds the 10-element array */
}
Here we pass a too-long length to memchr. The array has 10 elements but we pass 20. The element we are searching for, however, is always found before the end of the array. It is not clear to me whether this is legal.
If this is allowed, it would limit implementation flexibility, since the implementation cannot rely on the length being a valid indication of the size of the accessible memory region and hence must be careful about reading beyond the found element. An example would be an implementation that wants to do a 16-byte SIMD load starting at the passed-in pointer and then check all 16 bytes in parallel. If the user passes a length of 16, this would be safe only if the entire length were required to be accessible.
Otherwise (if the above code is legal) the implementation must avoid potentially faulting on elements past the target element, for example by aligning the load (potentially expensive) or checking if the pointer is near the end of a protection boundary.
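To make the "near the end of a protection boundary" option concrete, here is a minimal sketch of such a check. The name load_stays_in_page and the 4 KiB page size are assumptions for illustration only; a real library would know its platform's page size and would do this in implementation-specific code, since reading past the end of an object is not something portable user code may do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. A real implementation would use the platform's
   actual page size rather than this hard-coded value. */
#define ASSUMED_PAGE_SIZE ((uintptr_t)4096)

/* Hypothetical helper: returns true if a load of `width` bytes starting at
   `p` stays inside the page containing `p`, so the load cannot fault on a
   neighbouring page that happens to be unmapped. */
static bool load_stays_in_page(const void *p, size_t width)
{
    assert(width > 0 && width <= ASSUMED_PAGE_SIZE);
    uintptr_t offset = (uintptr_t)p & (ASSUMED_PAGE_SIZE - 1);
    return offset + width <= ASSUMED_PAGE_SIZE;
}
```

An implementation could take the wide-load fast path only when this check passes and fall back to a byte-by-byte scan otherwise.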
[1] Here's one of those rare questions where I guess tagging both C and C++ is valid: as far as I can tell the C++ standard just defers directly to the C standard here, via reference, in terms of behavior, but if that's not the case I want to know.
In C11 and C++17, the specification reads as follows (the final sentence is the important part):
void *memchr(const void *s, int c, size_t n);
The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s. The implementation shall behave as if it reads the characters sequentially and stops as soon as a matching character is found.
So long as memchr finds what it's looking for before you step out of bounds, you're okay.
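The "as if" requirement can be pictured with a reference implementation that literally reads one byte at a time. This is only a sketch of the observable behavior the C11 wording requires, not how any particular library implements memchr; the name seq_memchr is made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference version: reads bytes in order and returns at the
   first match, so no byte after the match is ever accessed. */
static void *seq_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = (const unsigned char *)s;
    unsigned char target = (unsigned char)c;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (p[i] == target) {
            return (void *)(p + i);  /* stop as soon as a match is found */
        }
    }
    return NULL;
}
```

Called on the question's buf with a length of 20, this loop touches only indices 0 through 3 before returning, which is why the oversized length is harmless under the C11 rule.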
C++11 and C++14 both use C99, which doesn't have such wording. (They refer to ISO/IEC 9899:1999)
C99 wording:
void *memchr(const void *s, int c, size_t n);
The memchr function locates the first occurrence of c (converted to an unsigned char) in the initial n characters (each interpreted as unsigned char) of the object pointed to by s.
Because C99 does not define what happens if you pass too large a size, the behavior is undefined in C99, and therefore also in C++11 and C++14, which refer to it.
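If the code has to be well defined under every one of these standards, one option is never to pass a length larger than the real object. A minimal sketch, assuming the caller knows the object's size; bounded_memchr and its capacity parameter are hypothetical names, not part of any standard library:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: never asks memchr to scan more than `capacity`
   bytes, where `capacity` is the real size of the object `s` points into. */
static void *bounded_memchr(const void *s, int c, size_t n, size_t capacity)
{
    assert(s != NULL);
    size_t len = n < capacity ? n : capacity;  /* clamp to the object size */
    return memchr(s, c, len);
}
```

For the question's example this amounts to calling memchr(buf, 4, sizeof buf), which is valid in C99 and C11 alike.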