
Can compiler optimization introduce bugs?

Today I had a discussion with a friend of mine and we debated for a couple of hours about "compiler optimization".

I defended the point that sometimes, a compiler optimization might introduce bugs or at least, undesired behavior.

My friend totally disagreed, saying that "compilers are built by smart people and do smart things" and thus, can never go wrong.

He didn't convince me at all, but I have to admit I lack real-life examples to strengthen my point.

Who is right here? If I am, do you have any real-life example where a compiler optimization produced a bug in the resulting software? If I'm mistaken, should I stop programming and learn fishing instead?

asked Apr 27 '10 by ereOn



2 Answers

Compiler optimizations can introduce bugs or undesirable behaviour. That's why you can turn them off.

One example: a compiler can optimize read/write access to a memory location, doing things like eliminating duplicate reads or duplicate writes, or re-ordering certain operations. If the memory location in question is only used by a single thread and is ordinary memory, that may be fine. But if the memory location is a hardware device I/O register, then re-ordering or eliminating writes may be completely wrong. In this situation you normally have to write the code knowing that the compiler might "optimize" it, and thus knowing that the naive approach doesn't work.
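
Here is a minimal sketch of how such code is usually written; the device, register addresses, and names are invented for illustration, and only the use of the volatile qualifier is the point. Volatile accesses are observable side effects, so the compiler may not coalesce, reorder, or eliminate them.

    #include <stdint.h>

    /* Hypothetical UART registers; the addresses are made up for this sketch. */
    #define UART_STATUS ((volatile uint32_t *)0x4000F004u)
    #define UART_DATA   ((volatile uint32_t *)0x4000F008u)
    #define TX_READY    (1u << 0)

    void uart_put(uint8_t byte)
    {
        /* volatile forces a real read of the status register on every
           iteration; without it the compiler could hoist the load out of
           the loop and spin forever on a stale value. */
        while ((*UART_STATUS & TX_READY) == 0)
            ;
        /* Likewise, this store cannot be reordered past other volatile
           accesses or eliminated as a "dead" write. */
        *UART_DATA = byte;
    }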

Update: As Adam Robinson pointed out in a comment, the scenario I describe above is more of a programming error than an optimizer error. But the point I was trying to illustrate is that a program which is otherwise correct, combined with an optimization which otherwise works properly, can produce bugs when the two are put together. In some cases the language specification says "You must do things this way because these kinds of optimizations may occur and your program will fail", in which case it's a bug in the code. But sometimes a compiler has a (usually optional) optimization feature that can generate incorrect code because the compiler is trying too hard to optimize the code or can't detect that the optimization is inappropriate. In this case the programmer must know when it is safe to turn on the optimization in question.

Another example: the Linux kernel had a bug where a potentially NULL pointer was being dereferenced before the test for that pointer being NULL. However, in some cases it was possible to map memory at address zero, thus allowing the dereference to succeed. The compiler, upon noticing that the pointer was dereferenced, assumed that it couldn't be NULL, then removed the later NULL test and all the code in that branch. This introduced a security vulnerability, as the function would proceed to use an invalid pointer containing attacker-supplied data. For cases where the pointer was legitimately NULL and the memory wasn't mapped at address zero, the kernel would still OOPS as before. So prior to optimization the code contained one bug; after optimization it contained two, and one of them allowed a local root exploit.
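
A simplified sketch of that pattern looks like the following; the structure and function names here are invented (the real bug was in a kernel network driver), but the dereference-before-check shape is the same.

    struct stats;                          /* opaque for this sketch        */
    struct device { struct stats *st; };

    int do_send(struct device *dev, struct stats *st);   /* assumed helper */

    int send_packet(struct device *dev)
    {
        struct stats *st = dev->st;  /* dereference happens first...        */

        if (dev == NULL)             /* ...so the compiler may assume dev   */
            return -1;               /* is non-NULL and delete this branch  */

        /* If page zero has been mapped by an attacker, execution reaches
           this point with attacker-controlled data instead of faulting. */
        return do_send(dev, st);
    }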

CERT has a presentation called "Dangerous Optimizations and the Loss of Causality" by Robert C. Seacord which lists a lot of optimizations that introduce (or expose) bugs in programs. It discusses the various kinds of optimizations that are possible, from "doing what the hardware does" to "trap all possible undefined behaviour" to "do anything that's not disallowed".

Some examples of code that's perfectly fine until an aggressively-optimizing compiler gets its hands on it (safer rewrites are sketched after the list):

  • Checking for overflow:

    // fails because the overflow test gets removed
    if (ptr + len < ptr || ptr + len > max) return EINVAL;

  • Using overflow arithmetic at all:

    // the compiler optimizes this to an infinite loop
    for (i = 1; i > 0; i += i) ++j;

  • Clearing memory of sensitive information:

    // the compiler can remove these "useless writes"
    memset(password_buffer, 0, sizeof(password_buffer));
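
For comparison, here is one possible way to rewrite each of the three examples so that it doesn't rely on undefined behaviour in the first place. This is only a sketch: the names check_len, count_doublings, and secure_clear are invented for illustration, and on systems that provide them, explicit_bzero or memset_s can replace the volatile loop.

    #include <errno.h>
    #include <stddef.h>

    /* Overflow check without pointer overflow (which is undefined behaviour
       and is exactly what lets the compiler delete the test).  Assumes ptr
       and max point into the same buffer and ptr <= max. */
    int check_len(const char *ptr, const char *max, size_t len)
    {
        if (len > (size_t)(max - ptr))
            return EINVAL;
        return 0;
    }

    /* Unsigned arithmetic wraps with well-defined behaviour, so this loop
       terminates after one iteration per value bit instead of being turned
       into an infinite loop. */
    unsigned count_doublings(void)
    {
        unsigned j = 0;
        for (unsigned i = 1; i != 0; i += i)
            ++j;
        return j;
    }

    /* Writes through a volatile-qualified pointer are observable side
       effects, so dead-store elimination must preserve them. */
    void secure_clear(void *buf, size_t len)
    {
        volatile unsigned char *p = buf;
        while (len--)
            *p++ = 0;
    }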

The problem here is that compilers were, for decades, less aggressive in their optimization, and so generations of C programmers learned and came to rely on things like fixed-size two's-complement addition and how it overflows. Then the C language standard is amended by the compiler developers, and the subtle rules change, despite the hardware not changing. The C language spec is a contract between the developers and the compilers, but the terms of the agreement are subject to change over time, and not everyone understands every detail or agrees that the details are even sensible.

This is why most compilers offer flags to turn off (or turn on) optimizations. Is your program written with the understanding that integers might overflow? Then you should turn off overflow optimizations, because they can introduce bugs. Does your program strictly avoid aliasing pointers? Then you can turn on the optimizations that assume pointers are never aliased. Does your program try to clear memory to avoid leaking information? Oh, in that case you're out of luck: you either need to turn off dead-code-removal or you need to know, ahead of time, that your compiler is going to eliminate your "dead" code, and use some work-around for it.
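
As a concrete illustration (the file name is arbitrary, and the flags shown are GCC/Clang switches; other compilers spell them differently), whether a check like the one below survives depends on how the file is compiled:

    /* Build without -fwrapv and the compiler may assume signed overflow
       never happens, fold "x + 100 < x" to 0, and remove callers' checks:

           cc -O2 overflow.c            # test may be optimized away
           cc -O2 -fwrapv overflow.c    # signed overflow wraps; test kept

       Related switches: -fno-strict-aliasing (don't assume differently
       typed pointers never alias) and -fno-delete-null-pointer-checks
       (keep NULL tests even after a dereference, as the Linux kernel does). */
    int will_overflow(int x)
    {
        return x + 100 < x;   /* meaningful only with -fwrapv (or unsigned) */
    }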

answered Oct 02 '22 by Mr. Shiny and New 安宇


If a bug goes away when you disable optimizations, most of the time it's still your fault

I am responsible for a commercial app, written mostly in C++ - started with VC5, ported to VC6 early on, and now successfully ported to VC2008. It grew to over 1 million lines in the last 10 years.

In that time I could confirm a single code-generation bug that occurred when aggressive optimizations were enabled.

So why am I complaining? Because in the same time, there were dozens of bugs that made me doubt the compiler - but they turned out to be due to my insufficient understanding of the C++ standard. The standard makes room for optimizations the compiler may or may not make use of.

Over the years, on different forums, I've seen many posts blaming the compiler that ultimately turned out to be bugs in the original code. No doubt many of them were obscure bugs that needed a detailed understanding of concepts used in the standard, but they were source code bugs nonetheless.

Why I reply so late: stop blaming the compiler before you have confirmed it's actually the compiler's fault.

answered Oct 02 '22 by peterchen