Division by zero and undefined behaviour in C

Question

This paper gives the following example of a piece of code that can trigger a division by zero:

if (arg2 == 0)
    ereport(ERROR, (errcode(ERRCODE_DIVISION_BY_ZERO),
                    errmsg("division by zero")));
/* No overflow is possible */
PG_RETURN_INT32((int32) arg1 / arg2);

Here ereport is a macro that expands to a call to errstart, a bool-returning function that may or may not return, followed by a call to another function that is made conditionally (using a ?:) on errstart's return value. In this case, I believe ereport with level ERROR unconditionally causes a longjmp() to someplace else.
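The macro shape described above can be sketched as follows. This is a minimal stand-in, not PostgreSQL's actual definition: the names demo_errstart, demo_errfinish, DEMO_EREPORT, the level constants, and the jmp_buf are all hypothetical, chosen only to mirror the "bool-returning call, then a conditional second call that may longjmp" structure.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

#define DEMO_WARNING 19            /* hypothetical severity levels */
#define DEMO_ERROR   20

static jmp_buf demo_handler;       /* where an ERROR-level report jumps to */

/* Hypothetical stand-in for errstart: decides whether the report proceeds. */
static bool demo_errstart(int level)
{
    return level >= DEMO_WARNING;
}

/* Hypothetical second call: returns for warnings, never returns for errors. */
static void demo_errfinish(int level, const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    if (level >= DEMO_ERROR)
        longjmp(demo_handler, 1);
}

/* Same shape as described in the text: a ?: on the first call's result. */
#define DEMO_EREPORT(level, msg) \
    (demo_errstart(level) ? demo_errfinish((level), (msg)) : (void) 0)

static int demo_div(int arg1, int arg2)
{
    if (arg2 == 0)
        DEMO_EREPORT(DEMO_ERROR, "division by zero");
    return arg1 / arg2;            /* reached only if the report returned */
}

/* Returns 0 and stores the quotient, or -1 if an error was reported. */
int demo_try_div(int arg1, int arg2, int *out)
{
    if (setjmp(demo_handler) != 0)
        return -1;
    *out = demo_div(arg1, arg2);
    return 0;
}
```

Nothing in the declaration of demo_errfinish says that it may not return, yet a call such as demo_try_div(7, 0, &q) is well-defined: control leaves through longjmp before the division is evaluated.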

Consequently, a naive interpretation of the above code is that, if arg2 is nonzero, the division will happen and the result will be returned, while, if arg2 is zero, an error will be reported and the division will not happen. However, the linked paper claims that a C compiler may legitimately hoist the division before the zero check, then infer that the zero check is never triggered. Their only reasoning, which seems incorrect to me, is that

[T]he programmer failed to inform the compiler that the call to ereport(ERROR, ...) does not return. This implies that the division will always execute.
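For reference, "informing the compiler" could look like the sketch below, which uses the C11 _Noreturn function specifier. The function names and the jmp_buf are hypothetical, and the sketch takes no position on whether such an annotation is needed for correctness; it only shows what the annotation would be.

```c
#include <setjmp.h>

static jmp_buf on_error;           /* hypothetical recovery point */

/* C11 _Noreturn states that control never comes back from this call. */
static _Noreturn void fail_division_by_zero(void)
{
    longjmp(on_error, 1);
}

static int checked_div(int arg1, int arg2)
{
    if (arg2 == 0)
        fail_division_by_zero();
    return arg1 / arg2;            /* not reached when arg2 == 0 */
}

/* Returns 0 and stores the quotient, or -1 if the zero check fired. */
int try_checked_div(int arg1, int arg2, int *result)
{
    if (setjmp(on_error) != 0)
        return -1;
    *result = checked_div(arg1, arg2);
    return 0;
}
```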

John Regehr has a simpler example:

void bar (void);
int a;
void foo3 (unsigned y, unsigned z)
{
  bar();
  a = y%z;
}

According to this blog post, clang hoists the modulo operation above the call to bar, and he shows some assembly code to prove it.

My understanding of C as it applies to these snippets was that

1. Functions that do not, or may not, return are well-formed in standard C, and declaring them requires no particular attributes, bells, or whistles.
2. The semantics of a call to a function that does not, or may not, return are well-defined, in particular by 6.5.2.2 "Function calls" in C99.
3. Since the ereport invocation is a full expression, there is a sequence point at the ;. Similarly, since the bar call in John Regehr's code is a full expression, there is a sequence point at the ;.
4. There is consequently a sequence point between the ereport invocation or bar call and the division or modulo.
5. C compilers may not introduce undefined behaviour into programs that do not elicit undefined behaviour on their own.

These five points seem to be enough to conclude that the above division-by-zero test is correctly-written and that hoisting the modulo above the call to bar is incorrect. Two compilers and a host of experts disagree. What is wrong with my reasoning?

R.. GitHub STOP HELPING ICE · Accepted Answer

The paper is wrong, and as for the clang example, this is a compiler bug (a rather common occurrence with clang...). I wish I could give you better reasons, but you already provided all the correct reasoning in the question.

Actually, for the clang issue, as far as I can tell, no bug has been demonstrated yet. Since bar does return in the example on the blog you linked to, the compiler is free to reorder the division across the call. This is trivial to do if bar is defined in the same translation unit, but it's also possible with LTO. To actually test for this bug, you need a function bar that never returns.
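A test along those lines might look like the following sketch. It is an assumption-laden illustration rather than a definitive reproducer: bar_never_returns, escape, and foo3_survives_zero are hypothetical names, and the volatile function pointer is just one way to keep the optimizer from seeing which function is called within a single translation unit (a separately compiled bar would serve the same purpose).

```c
#include <setjmp.h>

static jmp_buf escape;             /* hypothetical jump target */
int a;

/* A bar that never returns: it always leaves through longjmp. */
static void bar_never_returns(void)
{
    longjmp(escape, 1);
}

/* Volatile pointer so the compiler cannot tell which function is called. */
static void (*volatile bar)(void) = bar_never_returns;

void foo3(unsigned y, unsigned z)
{
    bar();
    a = y % z;                     /* must not execute if bar does not return */
}

/* Returns 1 if control came back through longjmp without the modulo running. */
int foo3_survives_zero(void)
{
    if (setjmp(escape) != 0)
        return 1;
    foo3(1u, 0u);                  /* z == 0, but bar leaves before the modulo */
    return 0;                      /* reached only if bar returned */
}
```

If a compiler hoisted the modulo above the call here, foo3_survives_zero() would typically trap on hardware where integer division by zero faults, instead of returning 1.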

Brendan · Answer

Using the "as if" rule...

The division can be done wherever the compiler likes, as long as the resulting code behaves as if the division were done in the correct place.

This means that:

a) if division by zero (or division resulting in an overflow) causes an exception (e.g. typical for integer division on 80x86) then the division can't be done before the function call (unless the compiler can prove that the division is always safe).

b) if division by zero (or division resulting in an overflow) does not cause an exception (e.g. typical for floating point on 80x86) the compiler may do the division before the function call, as long as nothing in the function being called can modify the values used in the division.
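Case b) can be illustrated with a small sketch. It assumes an implementation that follows IEEE 754 arithmetic with floating-point exceptions left at their default, non-trapping setting; the function names are hypothetical.

```c
#include <float.h>

/* Plain floating-point division; under the stated assumptions it cannot trap. */
double fp_quotient(double x, double y)
{
    return x / y;
}

/* Returns 1 if 1.0 / 0.0 quietly produced a value larger than DBL_MAX. */
int fp_divide_by_zero_is_quiet(void)
{
    volatile double zero = 0.0;    /* volatile to discourage constant folding */
    double q = fp_quotient(1.0, zero);
    return q > DBL_MAX;            /* true only for +infinity */
}
```

Because the division produces a value instead of trapping under these assumptions, performing it earlier or later has no observable effect, which is why the "as if" rule permits moving it.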

Tags:

c

language-lawyer

undefined-behavior

tmyklebu
