I know what UB is, so I'm not asking how to avoid it. I'm asking whether there's a way to make unit tests more resistant to it, even with a probabilistic approach that just makes UB more likely to become apparent rather than letting tests pass silently.
Let's say I want to write a test for a function and I do it wrong, like this:
#include <gtest/gtest.h>
#include <vector>

int main()
{
    std::vector<int> v{0};
    for (auto i = 0; i != 100; ++i) {
        v.push_back(3);     // push a 3
        v.pop_back();       // oops, popping the value I just pushed
        EXPECT_EQ(v[1], 3); // UB: index 1 is past the end, since size() is 1 again
    }
}
On my machine it consistently passes; the program is probably so simple that nothing overwrites the 3 in the memory where it lived before pop_back.
The test therefore clearly isn't reliable.
Is there any way to protect against such accidentally successful tests, even on statistical grounds ("by calling such a function before the EXPECT_EQ you increase the chances that the UB becomes visible")?
The code above is just an example (I'm not trying to test the STL). I know of std::vector<T>::at as a bounds-checked alternative to std::vector<T>::operator[], but that's a way to prevent undefined behavior in the first place, whereas I'm wondering how to defend against it.
For instance, leveraging UB itself by adding *(&v[0] + 1) = 10; right after v.pop_back(); makes the incorrectness of the test apparent, at least on my machine.
So I'm thinking of a tool/library/whatever that would, let's say, set the memory not held by v to random values after every executable line.
Clang with AddressSanitizer (https://clang.llvm.org/docs/AddressSanitizer.html) catches this error:
$ clang++ -Wall -std=c++11 -o test test.cpp
$ ./test # program runs without errors
$ clang++ -fsanitize=address -Wall -std=c++11 -o test test.cpp
$ ./test
=================================================================
==94146==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000f4 at pc 0x00010ebcbf54 bp 0x7ffee10362d0 sp 0x7ffee10362c8
READ of size 4 at 0x6020000000f4 thread T0
#0 0x10ebcbf53 in main+0x393 (test:x86_64+0x100002f53)
#1 0x7fff204c3f3c in start+0x0 (libdyld.dylib:x86_64+0x15f3c)
0x6020000000f4 is located 4 bytes inside of 8-byte region [0x6020000000f0,0x6020000000f8)
allocated by thread T0 here:
#0 0x10ec38c9d in wrap__Znwm+0x7d (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x54c9d)
#1 0x10ebcdb38 in std::__1::__libcpp_allocate(unsigned long, unsigned long)+0x18 (test:x86_64+0x100004b38)
#2 0x10ebcdaa9 in std::__1::allocator<int>::allocate(unsigned long)+0x49 (test:x86_64+0x100004aa9)
#3 0x10ebcd4cc in std::__1::allocator_traits<std::__1::allocator<int> >::allocate(std::__1::allocator<int>&, unsigned long)+0x1c (test:x86_64+0x1000044cc)
#4 0x10ebcfbc0 in std::__1::__split_buffer<int, std::__1::allocator<int>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<int>&)+0x180 (test:x86_64+0x100006bc0)
#5 0x10ebcf68c in std::__1::__split_buffer<int, std::__1::allocator<int>&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<int>&)+0x2c (test:x86_64+0x10000668c)
#6 0x10ebceec4 in void std::__1::vector<int, std::__1::allocator<int> >::__push_back_slow_path<int>(int&&)+0x154 (test:x86_64+0x100005ec4)
#7 0x10ebcc480 in std::__1::vector<int, std::__1::allocator<int> >::push_back(int&&)+0xd0 (test:x86_64+0x100003480)
#8 0x10ebcbedd in main+0x31d (test:x86_64+0x100002edd)
#9 0x7fff204c3f3c in start+0x0 (libdyld.dylib:x86_64+0x15f3c)
SUMMARY: AddressSanitizer: heap-buffer-overflow (test:x86_64+0x100002f53) in main+0x393
Shadow bytes around the buggy address:
0x1c03ffffffc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c03ffffffd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c03ffffffe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c03fffffff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x1c0400000000: fa fa fd fd fa fa 00 00 fa fa 00 06 fa fa 00 fa
=>0x1c0400000010: fa fa 00 00 fa fa 00 06 fa fa fd fa fa fa[04]fa
0x1c0400000020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c0400000030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c0400000040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c0400000050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x1c0400000060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==94146==ABORTING
[1] 94146 abort ./test
Checking for invalid memory accesses is unfortunately not good enough, because pop_back() is not required to relinquish the memory.
v[1] is always undefined behavior by virtue of reading from a destroyed object, but this is a subtlety that only exists during compilation, from the perspective of the C++ abstract machine. Once the code has been compiled to a binary, as long as the memory is still allocated and properly aligned, there is no "problem" at the machine level. Because of this, you will not necessarily catch such UB with system-level runtime checks.
While this is not a silver bullet for UB in general, there are some preprocessor macros you can define to enable additional validation within the standard library.
| stdlib | macro |
|---|---|
| libstdc++ | _GLIBCXX_DEBUG |
| libc++ | _LIBCPP_DEBUG |
| MSVC | automatic for Debug builds, but partial :( |
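So adding -D_GLIBCXX_DEBUG -D_LIBCPP_DEBUG to the compiler flags will reliably catch OP's error, at least when using gcc/clang. (Newer libc++ releases have replaced _LIBCPP_DEBUG with a different hardening mechanism, so check the documentation for the version you use.)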