I know that Undefined Behaviour, once it has happened, makes it impossible to reason about the code any longer. I am convinced, completely. I even think I should not dig too much into understanding UB: a sane C++ program should not play with UB, period.
But to convince my colleagues and managers of its real danger, I am trying to find a concrete example, using a bug we DO have in the product (which they think is not dangerous: at worst it will always crash with an access violation).
My main concern is about calling a virtual member function through a dangling pointer to an object of a polymorphic class.
When an object is deleted, Windows writes a few bytes in the header of the heap block, and usually also overwrites the first bytes of the heap block itself. This is its way to keep track of heap blocks, manage them as a linked list... OS stuff.
Though it's not defined in the C++ standard, polymorphism is implemented using virtual tables, AFAIK. Under Windows, the pointer to the virtual table is located in the first bytes of the heap block, given a class that inherits from only one base class. (It may be more complex with multiple inheritance, but I will not take this into account. Let's only consider a base class A, and several classes B, C, D inheriting from A.)
Now let's consider that I have a pointer to an A, which was instantiated as a D object. That D object has been deleted elsewhere in the code, so the heap block is now a free heap block and its first bytes have been overwritten. As a consequence, the virtual table pointer points almost at random somewhere in memory, let's say to the address 0x01234567.
Then, somewhere in the code, we call:
void test(A * pA)
{
    // here we do not know that pA is a dangling pointer:
    // that memory address has been deleted by another thread, in another part of the code
    pA->SomeVirtualFunction();
}
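Am I right in saying that:
1. the memory at 0x01234567 will be read as if it was a virtual table
2. the entry found there for SomeVirtualFunction, let's say 0x09876543, will be taken as the address of the function
3. the memory at 0x09876543 will be interpreted as valid binary code, and EXECUTED for real
I don't want to exaggerate in order to convince. So, is what I'm saying correct, possible, and likely?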
Undefined behavior (often abbreviated UB) is the result of executing code whose behavior is not well defined by the C++ language. For example, the C++ language doesn't have any rules determining what happens if you use the value of a variable that has not been given a known value.
In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres.
So, in C/C++ programming, undefined behavior means that the program may fail to compile, may execute incorrectly (either crashing or generating incorrect results), or may fortuitously do exactly what the programmer intended.
C and C++ have undefined behaviors because they allow compilers to avoid lots of checks. For example, code that indexes an array for performance need not check the bounds on every access, which avoids the need for a complex optimization pass to hoist such checks out of loops.
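The bounds-check trade-off can be seen in a small sketch using std::vector. The function names sumUnchecked and sumChecked are made up for this illustration. operator[] does no bounds check and is undefined behavior when the index is out of range, while at() pays for a check and throws std::out_of_range.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: sums the first n elements without any bounds check.
// The caller must guarantee n <= v.size(); otherwise v[i] is undefined behavior.
long sumUnchecked(const std::vector<int>& v, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += v[i];  // no check: fast, but UB if i >= v.size()
    }
    return total;
}

// Hypothetical helper: same loop, but every access is checked by at().
// Returns false instead of invoking UB when n is too large.
bool sumChecked(const std::vector<int>& v, std::size_t n, long& total) {
    total = 0;
    try {
        for (std::size_t i = 0; i < n; ++i) {
            total += v.at(i);  // throws std::out_of_range if i >= v.size()
        }
    } catch (const std::out_of_range&) {
        return false;
    }
    return true;
}
```

With a three-element vector, both helpers agree for n == 3. For n == 4 only sumChecked has a defined result (it returns false), while calling sumUnchecked would be undefined.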
Your example is a possibility.
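To make the steps in the question concrete without actually invoking undefined behavior, here is a hand-rolled model of "vptr plus vtable" dispatch. Everything in it (Object, VTable, callVirtual, simulateFree, kBogusTable) is hypothetical and only imitates what a typical compiler generates. A real compiler is also free to optimize on the assumption that the pointer is valid, so this is a sketch of the likely mechanism, not a guarantee.

```cpp
#include <string>

struct Object;
using Method = std::string (*)(const Object&);

// Hypothetical stand-in for a compiler-generated virtual table.
struct VTable {
    Method someVirtualFunction;  // slot 0
};

// Hypothetical stand-in for a polymorphic object: the vptr comes first,
// as described in the question for single inheritance.
struct Object {
    const VTable* vptr;
    int payload;
};

std::string dSomeVirtualFunction(const Object&) { return "D::SomeVirtualFunction"; }
std::string unrelatedCode(const Object&) { return "unrelated code executed"; }

const VTable kVTableD{&dSomeVirtualFunction};
// Stands in for whatever happens to live at the bogus address (0x01234567).
const VTable kBogusTable{&unrelatedCode};

// Roughly what pA->SomeVirtualFunction() boils down to in this model.
std::string callVirtual(const Object& obj) {
    const VTable* table = obj.vptr;              // 1. read the vptr from the object
    Method target = table->someVirtualFunction;  // 2. read the slot from the table
    return target(obj);                          // 3. jump to whatever address was found
}

// Models the heap overwriting the first bytes of a freed block.
void simulateFree(Object& obj) { obj.vptr = &kBogusTable; }
```

In the model, callVirtual on a fresh Object{&kVTableD, 42} returns "D::SomeVirtualFunction". After simulateFree, the same call returns "unrelated code executed". In a real process, the overwritten bytes often point at unmapped memory and you get the access violation your colleagues expect, but nothing guarantees that outcome.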
However, the situation is much, much worse.
If someone is attacking users of your application, then the memory will not contain random data. The attacker will try, and will likely manage, to influence what that data will be. Once that happens, the attacker may be able to determine which code will be executed. And once that happens, unless your application is properly sandboxed (which I bet it is not, given the attitude of your co-developers), the attacker may be able to take over the user's computer.
And that's not a hypothetical possibility, but something that has happened and will happen again.
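The pattern can be modelled with defined behavior only, as a rough sketch. OneSlotHeap, storeHandler, callStoredHandler and demo are all hypothetical names. The "heap" here always hands out the same bytes, which mimics an allocator reusing a freed block for the next request of the same size. In a real attack the attacker supplies raw bytes (for example through a network message), not a function pointer from inside the program.

```cpp
#include <cassert>
#include <cstring>
#include <string>

using Handler = std::string (*)();

std::string legitimateHandler() { return "normal behaviour"; }
std::string attackerChosenHandler() { return "attacker-chosen code path"; }

// Hypothetical one-slot "heap": every allocation reuses the same bytes.
class OneSlotHeap {
public:
    unsigned char* allocate() { return slot_; }
    void release(unsigned char*) {}  // the bytes simply become available again
private:
    alignas(Handler) unsigned char slot_[sizeof(Handler)] = {};
};

// Store and load a function pointer via memcpy, so the model itself has no UB.
void storeHandler(unsigned char* block, Handler h) {
    std::memcpy(block, &h, sizeof h);
}

std::string callStoredHandler(const unsigned char* block) {
    Handler h;
    std::memcpy(&h, block, sizeof h);
    return h();
}

std::string demo() {
    OneSlotHeap heap;

    unsigned char* object = heap.allocate();
    storeHandler(object, &legitimateHandler);
    heap.release(object);  // "object" is now the dangling pointer of the model

    // An allocation triggered by the attacker lands in the same bytes...
    unsigned char* request = heap.allocate();
    assert(request == object);
    // ...and is filled with contents the attacker controls.
    storeHandler(request, &attackerChosenHandler);

    // The stale pointer is used as if the original object were still there.
    return callStoredHandler(object);
}
```

Here demo() returns "attacker-chosen code path": the call through the stale pointer goes wherever the new contents of the block say it should go.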