
No useful and reliable way to detect integer overflow in C/C++?

No, this is not a duplicate of How to detect integer overflow?. The issue is the same but the question is different.


The gcc compiler can optimize away an overflow check (with -O2), for example:

int a, b;
b = abs(a);                     // will overflow if a = 0x80000000
if (b < 0) printf("overflow");  // optimized away

The gcc people argue that this is not a bug. Overflow is undefined behavior, according to the C standard, which allows the compiler to do anything. Apparently, anything includes assuming that overflow never happens. Unfortunately, this allows the compiler to optimize away the overflow check.
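
To see how aggressive this is, here is another check of the same kind (a minimal illustration, not code from my program) that gcc may remove for the same reason:

#include <stdio.h>

/* Because signed overflow is undefined behavior, gcc -O2 may assume that
   (a + 100 < a) can never be true and delete the branch entirely. */
void report_if_wraps(int a) {
    if (a + 100 < a)                /* relies on signed wraparound: UB */
        printf("overflow\n");       /* this call may not exist in the binary */
}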

The safe way to check for overflow is described in a recent CERT paper. This paper recommends doing something like this before adding two integers:

if ( ((si1^si2) | (((si1^(~(si1^si2) & INT_MIN)) + si2)^si2)) >= 0) { 
  /* handle error condition */
} else {
  sum = si1 + si2;
}

Apparently, you have to do something like this before every +, -, *, / and other operations in a series of calculations when you want to be sure that the result is valid. For example if you want to make sure an array index is not out of bounds. This is so cumbersome that practically nobody is doing it. At least I have never seen a C/C++ program that does this systematically.
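
For comparison, the same precondition can be written as a plain range check (a sketch in the same spirit, not the exact CERT code):

#include <limits.h>

/* Reject the addition before it can overflow, so the undefined behavior
   never happens. Returns 1 and stores the sum on success, 0 on overflow. */
int safe_add(int si1, int si2, int *sum) {
    if ((si2 > 0 && si1 > INT_MAX - si2) ||
        (si2 < 0 && si1 < INT_MIN - si2))
        return 0;                   /* would overflow */
    *sum = si1 + si2;
    return 1;
}

This is more readable, but just as tedious when it has to wrap every single operation.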

Now, this is a fundamental problem:

  • Checking an array index before accessing the array is useful, but not reliable.

  • Checking every operation in the series of calculations with the CERT method is reliable but not useful.

  • Conclusion: There is no useful and reliable way of checking for overflow in C/C++!

I refuse to believe that this was intended when the standard was written.

I know that there are certain command line options that can fix the problem, but this doesn't alter the fact that we have a fundamental problem with the standard or the current interpretation of it.

Now my question is: Are the gcc people taking the interpretation of "undefined behavior" too far when it allows them to optimize away an overflow check, or is the C/C++ standard broken?

Added note: Sorry, you may have misunderstood my question. I am not asking how to work around the problem - that has already been answered elsewhere. I am asking a more fundamental question about the C standard. If there is no useful and reliable way of checking for overflow then the language itself is dubious. For example, if I make a safe array class with bounds checking then I should be safe, but I'm not if the bounds checking can be optimized away.
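
To make that concrete, here is the kind of bounds check I am worried about (a hypothetical sketch, not my actual class):

/* The "i + n < i" test is meant to catch a wrapped-around index, but it
   relies on signed overflow, which is undefined. The optimizer is allowed
   to treat it as always false and drop it; if the overflow then really
   happens at run time, the wrapped (negative) index can slip past the
   remaining comparison and the access goes out of bounds anyway. */
int element(const int *data, int size, int i, int n) {
    if (i < 0 || n < 0 || i + n < i || i + n >= size)
        return -1;                  /* out of bounds */
    return data[i + n];
}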

If the standard allows this to happen then either the standard needs revision or the interpretation of the standard needs revision.

Added note 2: People here seem unwilling to discuss the dubious concept of "undefined behavior". The fact that the C99 standard lists 191 different kinds of undefined behavior (link) is an indication of a sloppy standard.

Many programmers readily accept the statement that "undefined behavior" gives the license to do anything, including formatting your hard disk. I think it is a problem that the standard puts integer overflow into the same dangerous category as writing outside array bounds.

Why are these two kinds of "undefined behavior" different? Because:

  • Many programs rely on integer overflow being benign, but few programs rely on writing outside array bounds when you don't know what is there.

  • Writing outside array bounds actually can do something as bad as formatting your hard disk (at least in an unprotected OS like DOS), and most programmers know that this is dangerous.

  • When you put integer overflow into the dangerous "anything goes" category, it allows the compiler to do anything, including lying about what it is doing (in the case where an overflow check is optimized away).

  • An error such as writing outside array bounds can be found with a debugger, but the error of optimizing away an overflow check cannot, because optimization is usually off when debugging.

  • The gcc compiler evidently refrains from the "anything goes" policy in some cases of integer overflow. For example, there are many cases where it refrains from optimizing a loop unless it can verify that overflow is impossible. For some reason, the gcc people have recognized that following the "anything goes" policy there would cause too many errors, yet they take a different attitude to the problem of optimizing away an overflow check.

Maybe this is not the right place to discuss such philosophical questions. At least, most answers here are off the point. Is there a better place to discuss this?

A Fog asked Jul 28 '11


2 Answers

Ask yourself: how often do you actually need checked arithmetic? If you need it often, you should write a checked_int class that overloads the common operators and encapsulates the checks. Props for sharing the implementation on an open-source website.

Better yet (arguably), use a big_integer class so that overflows can’t happen in the first place.
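
A minimal sketch of what such a checked_int class could look like (the name and the single overloaded operator are only illustrative):

#include <limits>
#include <stdexcept>

class checked_int {
    int value_;
public:
    explicit checked_int(int v = 0) : value_(v) {}
    int value() const { return value_; }

    checked_int& operator+=(checked_int rhs) {
        // Precondition test: refuse the addition before it overflows,
        // so undefined behavior is never executed.
        if ((rhs.value_ > 0 && value_ > std::numeric_limits<int>::max() - rhs.value_) ||
            (rhs.value_ < 0 && value_ < std::numeric_limits<int>::min() - rhs.value_))
            throw std::overflow_error("checked_int: addition overflows");
        value_ += rhs.value_;
        return *this;
    }
};

inline checked_int operator+(checked_int lhs, checked_int rhs) { lhs += rhs; return lhs; }

The same pattern extends to -, * and / (division only needs a check for INT_MIN / -1 and division by zero).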

Konrad Rudolph answered Dec 08 '22

The gcc developers are entirely correct here. When the standard says that the behavior is undefined, it means exactly that: there are no requirements on the compiler.

Since a valid program cannot do anything that causes UB (it would no longer be valid), the compiler may assume that UB never happens. And if it happens anyway, anything the compiler does is acceptable.

For your problem with overflow, one solution is to consider what ranges the calculations are supposed to handle. For example, when balancing my bank account, I can assume that the amounts would be well below 1 billion, so a 32-bit int will work.

For your application domain you can probably do similar estimates about exactly where an overflow could be possible. Then you can add checks at those points or choose another data type, if available.
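
As an illustration of that approach (the limit and names here are assumptions, not from the answer): if every amount in the domain is known to stay within ±1,000,000,000, then the sum of two such values is at most 2,000,000,000, which still fits in a 32-bit int, so one check at the boundary is enough:

#include <assert.h>

enum { MAX_AMOUNT = 1000000000 };   /* assumed domain limit */

int add_amounts(int balance, int amount) {
    /* Validate once at the boundary; within this range a single
       addition cannot exceed INT_MAX (2,147,483,647). */
    assert(balance >= -MAX_AMOUNT && balance <= MAX_AMOUNT);
    assert(amount  >= -MAX_AMOUNT && amount  <= MAX_AMOUNT);
    return balance + amount;
}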

Bo Persson answered Dec 08 '22