Consider:
constexpr char s1[] = "a";
constexpr char s2[] = "abc";
std::memcmp(s1, s2, 3);
If memcmp stops at the first difference it sees, it will not read past the second byte of s1 (the null terminator). However, I don't see anything in the C standard to confirm this behavior, and I don't know of anything in C++ that extends it.
N1570 7.24.4.1:
int memcmp(const void *s1, const void *s2, size_t n);
The memcmp function compares the first n characters of the object pointed to by s1 to the first n characters of the object pointed to by s2.
Is my understanding correct that the standard describes the behavior as reading all n bytes of both arguments, but libraries can short-circuit as if they had?
The function is not guaranteed to short-circuit because the standard doesn't say it must.
Not only is it not guaranteed to short-circuit, but in practice many implementations will not. For example, glibc compares elements of type unsigned long int (except for the last few bytes), so on a 64-bit implementation it could read up to 7 bytes past the location that compared differently.
Some may think that this won't cause an access violation on the platforms glibc targets, because access to these unsigned long ints will always be aligned and therefore will not cross a page boundary. But when the two sources have a different alignment, glibc reads two consecutive unsigned long ints from one of the sources, and those may lie in different pages. If the differing byte was in the first of them, an access violation can still be triggered before glibc performs the comparison (see the function memcmp_not_common_alignment).
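To illustrate the mechanism (this is a simplified sketch, not glibc's actual code), a word-at-a-time comparison loads a full unsigned long from each buffer before inspecting any byte inside it, so the load itself can touch memory beyond the byte that differs:

#include <stddef.h>

/* Simplified sketch of a word-at-a-time comparison (NOT glibc's actual
   code).  For clarity it ignores alignment, strict-aliasing rules, and
   the tail bytes when n is not a multiple of sizeof(unsigned long). */
static int memcmp_wordwise(const void *a, const void *b, size_t n)
{
    const unsigned long *wa = a;
    const unsigned long *wb = b;

    for (size_t i = 0; i < n / sizeof(unsigned long); i++) {
        /* A full word is loaded from each buffer before any byte of it
           is inspected, so the load can reach past the differing byte. */
        if (wa[i] != wb[i]) {
            const unsigned char *pa = (const unsigned char *)&wa[i];
            const unsigned char *pb = (const unsigned char *)&wb[i];
            for (size_t j = 0; j < sizeof(unsigned long); j++)
                if (pa[j] != pb[j])
                    return pa[j] < pb[j] ? -1 : 1;
        }
    }
    return 0;
}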
In short: specifying a length that is larger than the real size of the buffer is undefined behavior even if the differing byte occurred before that length, and it can cause crashes on common implementations.
Here's proof that it can crash: https://ideone.com/8jTREr
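For reference, here is one self-contained way to set up such a crash on a POSIX system (a sketch in the same spirit as the ideone example, not its exact code): place a short buffer at the end of an accessible page, follow it with a PROT_NONE guard page, and pass an n that extends into the guard page. Whether it actually faults depends on the memcmp implementation, but the call is undefined behavior either way.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    long page = sysconf(_SC_PAGESIZE);

    /* Two pages: the second is made inaccessible so any read past the
       first page triggers a fault. */
    char *mem = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;
    if (mprotect(mem + page, page, PROT_NONE) != 0)
        return 1;

    /* s1 ends exactly at the page boundary; s2 has a different alignment
       and differs from s1 in its very first byte. */
    char *s1 = mem + page - 16;
    memset(s1, 'x', 16);

    char buf[64];
    char *s2 = buf + 1;           /* deliberately misaligned relative to s1 */
    memset(s2, 'y', 32);

    /* n = 32 extends 16 bytes past the end of s1 into the guard page.
       A byte-by-byte memcmp would stop at the first byte; a word- or
       vector-wise one may read into the guard page first and crash. */
    printf("%d\n", memcmp(s1, s2, 32));
    return 0;
}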