This paper contains the following example of code that can trigger a division by zero:
    if (arg2 == 0)
        ereport(ERROR, (errcode(ERRCODE_DIVISION_BY_ZERO),
                        errmsg("division by zero")));
    /* No overflow is possible */
    PG_RETURN_INT32((int32) arg1 / arg2);
ereport here is a macro that expands to a call to a bool-returning function errstart that may or may not return and, conditional (using a ?:) on its return value, a call to another function. In this case, I believe ereport with level ERROR unconditionally causes a longjmp() someplace else.
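As a rough illustration of the macro shape described above, here is a minimal sketch in C. The names my_ereport, my_errstart, my_errfinish, MY_ERROR and recovery_point are hypothetical stand-ins invented for this example, not PostgreSQL's actual definitions; the sketch only assumes that the error level ends in a longjmp(), as the question believes.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

#define MY_ERROR 20                 /* hypothetical level, not PostgreSQL's value */

static jmp_buf recovery_point;      /* hypothetical: where errors unwind to */

/* Decides whether the report should be processed at all. */
static bool my_errstart(int level)
{
    (void) level;
    return true;
}

/* Emits the message; at MY_ERROR and above it jumps away and never returns. */
static void my_errfinish(int level, const char *msg)
{
    fprintf(stderr, "ERROR: %s\n", msg);
    if (level >= MY_ERROR)
        longjmp(recovery_point, 1);
}

/* A ?: on the bool result selects whether the second call happens. */
#define my_ereport(level, msg) \
    (my_errstart(level) ? my_errfinish((level), (msg)) : (void) 0)

/* Returns 0 and stores the quotient on success, -1 if an error was raised. */
int checked_div(int arg1, int arg2, int *result)
{
    if (setjmp(recovery_point) != 0)
        return -1;                  /* reached via longjmp from my_errfinish */

    if (arg2 == 0)
        my_ereport(MY_ERROR, "division by zero");
    /* Only reached when arg2 != 0: the report above does not return. */
    *result = arg1 / arg2;
    return 0;
}
```

Nothing in the declaration of my_errfinish tells the compiler that it may not return, which is exactly the situation the question is about.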
Consequently, a naive interpretation of the above code is that, if arg2 is nonzero, the division will happen and the result will be returned, while, if arg2 is zero, an error will be reported and the division will not happen. However, the linked paper claims that a C compiler may legitimately hoist the division before the zero check, then infer that the zero check is never triggered. Their only reasoning, which seems incorrect to me, is that
[T]he programmer failed to inform the compiler that the call to ereport(ERROR, ...) does not return. This implies that the division will always execute.
John Regehr has a simpler example:
    void bar (void);
    int a;
    void foo3 (unsigned y, unsigned z)
    {
        bar();
        a = y%z;
    }
According to this blog post, clang hoists the modulo operation above the call to bar, and he shows some assembly code to prove it.
My understanding of C as it applies to these snippets was that:

1. Functions that do not, or may not, return are well-formed in standard C, and declarations of such functions require no particular attributes, bells, or whistles.
2. The semantics of a call to a function that does not, or may not, return are well-defined, in particular by 6.5.2.2 "Function calls" in C99.
3. Since the ereport invocation is a full expression, there is a sequence point at the ;. Similarly, since the bar call in John Regehr's code is a full expression, there is a sequence point at the ;.
4. There is consequently a sequence point between the ereport invocation or bar call and the division or modulo.
5. C compilers may not introduce undefined behaviour to programs that do not elicit undefined behaviour on their own.
These five points seem to be enough to conclude that the above division-by-zero test is correctly-written and that hoisting the modulo above the call to bar is incorrect. Two compilers and a host of experts disagree. What is wrong with my reasoning?
The paper is wrong, and as for the clang example, this is a compiler bug (a rather common occurrence with clang...). I wish I could give you better reasons, but you already provided all the correct reasoning in the question.
Actually, for the clang issue, as far as I can tell, no bug has been demonstrated yet. Since bar does return in the example on the blog you linked to, the compiler is free to reorder the division across the call. This is trivial to do if bar is defined in the same translation unit, but it's also possible with LTO. To actually test for this bug, you need a function bar that never returns.
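One way such a test might look is sketched below, in C. This is only a probe under stated assumptions: bar is turned into a volatile function pointer (a change from the blog's plain declaration, made so that a single file stays opaque to the optimizer), and bar_never_returns, escape and run_probe are hypothetical names. Reading the generated assembly remains the more reliable check, since a runtime probe only shows what one particular build does.

```c
#include <setjmp.h>

static jmp_buf escape;              /* hypothetical: where bar jumps back to */

int a;

/* Never returns to its caller: it unwinds to the setjmp in run_probe. */
static void bar_never_returns(void)
{
    longjmp(escape, 1);
}

/* Assumption: a volatile pointer hides the callee from the optimizer,
   standing in for a bar defined in another translation unit. */
void (*volatile bar)(void) = bar_never_returns;

void foo3(unsigned y, unsigned z)
{
    bar();
    a = y % z;      /* never executed here, because bar does not return */
}

/* Returns 1 if bar escaped before the modulo ran. If the modulo were
   hoisted above the call, z == 0 would typically trap on x86 instead. */
int run_probe(unsigned y, unsigned z)
{
    if (setjmp(escape) != 0)
        return 1;
    foo3(y, z);
    return 0;       /* only reached if bar returned and z was nonzero */
}
```

Calling run_probe(1u, 0u) has no undefined behaviour in the abstract machine, because the modulo is never evaluated; a crash there would therefore point at the compiler rather than at the program.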
Using the "as if" rule...
The division can be done wherever the compiler feels like it; as long as the resulting code behaves as if the division was done in the correct place.
This means that:
a) if division by zero (or division resulting in an overflow) causes an exception (e.g. typical for integer division on 80x86) then the division can't be done before the function call (unless the compiler can prove that the division is always safe).
b) if division by zero (or division resulting in an overflow) does not cause an exception (e.g. typical for floating point on 80x86) the compiler may do the division before the function call; as long as nothing in the function being called can modify the values used in the division.
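Case (b) can be sketched in C as follows. The sketch assumes IEEE 754 floating point (Annex F), where dividing a nonzero value by zero yields an infinity rather than a trap; the names den, set_den, call_then_divide and quotient_is_infinite are hypothetical and exist only for this illustration.

```c
#include <float.h>

static double num = 1.0;
static double den = 0.0;

/* The callee modifies a value used by the later division. */
static void set_den(void)
{
    den = 4.0;
}

/* The division must observe den == 4.0, so it cannot be moved above
   the call unless the compiler accounts for the store in set_den. */
double call_then_divide(void)
{
    set_den();
    return num / den;
}

/* Assuming IEEE 754, a nonzero x divided by 0.0 gives an infinity,
   which compares greater than DBL_MAX, and no exception is raised. */
int quotient_is_infinite(double x, double y)
{
    return x / y > DBL_MAX;
}
```

Because the floating-point division cannot trap under these assumptions, the only thing that constrains its placement is the data dependency on den.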