I'm using clang++ 10 on Ubuntu 20.04 LTS, with -fsanitize-undefined-trap-on-error -fsanitize=address,undefined,nullability,implicit-integer-truncation,implicit-integer-arithmetic-value-change,implicit-conversion,integer
My code is generating random bytes with
std::random_device rd;
std::mt19937 gen(rd());
std::uniform_int_distribution<uint8_t> dd(0, 255);
...
ch = uint8_t(dd(gen));
This last line causes the sanitizer to report undefined behavior is in bits/random.tcc
template<...> void mersenne_twister_engine<...>::
_M_gen_rand(void) {
const _UIntType __upper_mask = (~_UIntType()) << __r;
const _UIntType __lower_mask = ~__upper_mask;
for (size_t __k = 0; __k < (__n - __m); ++__k)
{
_UIntType __y = ((_M_x[__k] & __upper_mask)
| (_M_x[__k + 1] & __lower_mask));
_M_x[__k] = (_M_x[__k + __m] ^ (__y >> 1)
^ ((__y & 0x01) ? __a : 0));
}
for (size_t __k = (__n - __m); __k < (__n - 1); ++__k)
{
_UIntType __y = ((_M_x[__k] & __upper_mask)
| (_M_x[__k + 1] & __lower_mask));
_M_x[__k] = (_M_x[__k + (__m - __n)] ^ (__y >> 1) <<<<===== this line
^ ((__y & 0x01) ? __a : 0));
}
_UIntType __y = ((_M_x[__n - 1] & __upper_mask)
| (_M_x[0] & __lower_mask));
_M_x[__n - 1] = (_M_x[__m - 1] ^ (__y >> 1)
^ ((__y & 0x01) ? __a : 0));
_M_p = 0;
}
The error reads:
/usr/include/c++/10/bits/random.tcc:413:33: runtime error: unsigned integer overflow: 397 - 624 cannot be represented in type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/include/c++/10/bits/random.tcc:413:33 in
/usr/include/c++/10/bits/random.tcc:413:26: runtime error: unsigned integer overflow: 227 + 18446744073709551389 cannot be represented in type 'unsigned long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /usr/include/c++/10/bits/random.tcc:413:26 in
It appears that there is a difference __m-__n == 397 - 624
that is obviously negative but the operands are all unsigned.
The variables being subtracted are template parameters defined as size_t __n, size_t __m
so this is not a random edge case, it's the actual template being implemented.
Is this a bug in this implementation of the STL or my usage is wrong?
A minimal reproducible example: https://godbolt.org/z/vvjWscPnj
UPDATE: Issue (not a bug) filed to GCC https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106469 - closed as "WONT FIX"
GCC's team called clang's ubsan unsigned integer overflow checks bad practice, because the behaviour is well-defined (as modulo wrapping) in ISO C++. Although modulus arithmetic is used in PRNG, this is not the case in this particular case.
However in most userspace code a unsigned overflow is almost all the time a bug to be caught and this non-bug on the GCC's STL prevents users from benefitting from this useful check.
Although as the other answer indicates it is undefined behavior per standard to instantiate std::uniform_int_distribution
with uint8_t
template argument, the UBsan warning here is unrelated to that.
UBSan is flagging the implementation of the Mersenne twister itself, but the implementation doesn't have any undefined behavior or bug.
If you look closely you see that the offending expression is
_M_x[__k + (__m - __n)]
where __k
is a value in the range from (__n - __m)
to (__n - 1)
via the for
loop.
All of the types involved in these operations are std::size_t
which is unsigned. As a consequence these operations all use modular arithmetic and therefore even if __m - __n
is negative and not representable in the unsigned type, the result of
__k + (__m - __n)
will lie between 0
and __m - 1
, so that indexing the array with it is not a problem. No undefined behavior, unspecified behavior or implementation-defined behavior is involved.
The UBSan check which is flagging this is not flagging actual undefined behavior. It is perfectly ok to rely on the wrap-around behavior of unsigned arithmetic like this if one is aware of it. The unsigned overflow check is only meant to be used to flag instances of such wrap-around where it was not intentional. You shouldn't use it on other's code that might rely on it or on your own code if you might be relying on it.
In -fsanitize=address,undefined,nullability,implicit-integer-truncation,implicit-integer-arithmetic-value-change,implicit-conversion,integer
all except address
and undefined
enable UBsan checks which are not flagging actual undefined behavior, but conditions that may be unintentional in many cases. The default -fsanitize=undefined
sanitizer flag doesn't enable the unsigned integer overflow check by default for the reasons given above. See https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html for details.
The result of using uint8_t
in a std::uniform_int_distribution
is undefined, so:
std::uniform_int_distribution<uint8_t> dd(0, 255); // Don't do this!
You can use any of short
, int
, long
, long long
, unsigned short
, unsigned int
, unsigned long
, or unsigned long long
instead.
Quote from rand.req.gen
/1.5
Throughout this subclause [rand], the effect of instantiating a template:
that has a template type parameter namedIntType
is undefined unless the corresponding template argument is cv-unqualified and is one ofshort
,int
,long
,long long
,unsigned short
,unsigned int
,unsigned long
, orunsigned long long
.
If that doesn't help, skip the -fsanitize=integer
option since
-fsanitize=integer
: Checks for undefined or suspicious integer behavior (e.g. unsigned integer overflow). Enablessigned-integer-overflow
... and unsigned integer overflow does not have undefined behavior. The check for signed integer overflow will be automatically enabled by using -fsanitize=undefined
so you don't have to enable that separately.
If that still doesn't help, it could be a bug in the gcc library implementation used by clang++
that causes this. You can try using clang++
's library implementation to see if that helps:
clang++ -stdlib=libc++ ...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With