 

Are compiler optimizations safe?

I recently discovered at work that it is policy not to use compiler optimizations for hard real-time embedded systems because of the risk of compiler bugs (we mainly use gcc, but the policy extends to other compilers as well). Apparently this policy started because someone was burnt in the past by an optimizer bug. My gut feeling is that this is overly paranoid, so I've started looking for data on the issue, but the problem is that I can't find any hard data.

Does anyone know of a way to actually get this type of data? Can the gcc bugzilla page be used to generate some statistics of bugs vs compiler optimization level? Is it even possible to get unbiased data like this?

asked Jan 30 '12 by nightrain

4 Answers

I don't have any data (and haven't heard of anyone that does ...) but ...

I'd choose which compiler I would use before I'd choose to disable optimizations. In other words, I wouldn't use any compiler I couldn't trust the optimizations on.

The Linux kernel is compiled with -Os. That's a lot more convincing to me than any bugzilla analysis.

Personally, I'd be okay with any version of gcc that Linux is okay with.

As another data point, Apple has been converting from gcc to llvm, with and without clang. llvm has traditionally had issues with some C++, and while llvm-gcc is now a lot better, there still seem to be issues with clang++. But that rather proves the pattern: while Apple (purportedly) now compiles OS X and iOS with clang, they don't use much, if any, C++ or Objective-C++. So for pure C and Objective-C, I'd trust clang, but I still don't yet trust clang++.

answered Nov 13 '22 by smparkes

Is using a compiler safe?

A compiler, by design, transforms your code into another form. It should normally transform it correctly, but like all software it may have a bug lurking somewhere. So no, it is not safe.

What can make code safe?

Testing/Usage.

For bugs to manifest, the code that contains them must be run in a particular configuration. For any non-trivial piece of software it is nigh impossible to prove the absence of bugs; however, heavy testing and heavy usage tend to at least clear some paths of execution.

So, how can I be safe?

Well, by using the same paths that everyone else does. This gives you the best chance that the path is bug-free, given all the people who have already been through it.

For gcc, then? I would use -O2 or -Os (like Linux does), because those are likely to have received a tremendous amount of scrutiny, direct or indirect.

Should you turn on optimizations?

However, introducing optimizations into a tool-chain is disruptive. It requires more than just flipping a switch: you need to perform heavy testing to make sure that nothing bad happens under your conditions.

More specifically, compilers rely on undefined behavior to perform a number of optimizations. If your code has never been exposed to optimizations, it is very likely to rely on such undefined behavior here and there, and turning on optimizations may expose those bugs (not introduce them).
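
A minimal sketch of how this plays out in practice (hypothetical function names): signed integer overflow is undefined behavior in C, so an optimizer is allowed to assume it never happens and quietly remove a check that depends on it.

    #include <limits.h>
    #include <stdio.h>

    /* Relies on undefined behavior: signed overflow. An optimizer may
       assume "x + 1 > x" always holds and fold this to "return 0;". */
    static int will_overflow_ub(int x)
    {
        return x + 1 < x;            /* UB when x == INT_MAX */
    }

    /* Well-defined rewrite that survives any optimization level. */
    static int will_overflow_ok(int x)
    {
        return x == INT_MAX;
    }

    int main(void)
    {
        /* At -O0 both calls typically print 1; at -O2 the first one
           may print 0, exposing the latent bug rather than introducing it. */
        printf("%d %d\n", will_overflow_ub(INT_MAX), will_overflow_ok(INT_MAX));
        return 0;
    }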

It's no more disruptive than switching compilers though.

answered Nov 13 '22 by Matthieu M.

You are assuming the compiler is bug-free without optimizations and that only the optimizations are dangerous. The compilers themselves are programs and very often have bugs, with or without certain features in use. Sure, the features might make things better, or they might make them worse.

llvm was mentioned in another answer; there is a well-known llvm optimization bug that they appear to have zero interest in fixing:

while(1) continue;

gets optimized out, just goes away... sometimes... and other similar, not completely infinite loops also disappear in the llvm optimizer, leaving you with a binary that doesn't match your source code. This is one I know of; there are probably many more in both gcc and llvm.
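
A minimal sketch of a common workaround, assuming you actually want such a loop to survive: give the loop an observable side effect, for instance through a volatile object, so the optimizer is not allowed to delete it (names here are made up for illustration).

    #include <stdint.h>

    /* A side-effect-free infinite loop like "while(1) continue;" has been
       known to vanish under aggressive optimization. A volatile access
       inside the loop is an observable side effect, so the loop must
       remain in the binary. */
    static volatile uint32_t keep_alive;

    void wait_forever(void)
    {
        while (1) {
            keep_alive = 0u;   /* volatile store: cannot be optimized away */
        }
    }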

gcc is a monster that is barely held together with duct tape and baling wire. It is like watching one of those Faces of Death movies: once you have had those images in your head, you can't unwatch them; they are burned in there for life. So it is worth finding out for yourself how scary gcc is by looking behind the curtain, but you might not be able to forget what you have seen. For various targets, -O0, -O1, -O2 and -O3 can and have all failed miserably with some code at some point in time. Likewise, the fix is sometimes to optimize more, not less.

When you write a program, the hope is that the compiler does what it says it does, just as you hope your program does what you say it does. But that is not always the case: your debugging does not end when the source code is perfect, it ends when the binary is perfect, and that includes whatever binary and operating system you target (different minor versions of gcc produce different binaries, and different Linux targets react differently to programs).

The most important advice is to develop and test using the target optimization level. If you develop and test by always building for a debugger, you have created a program that works in a debugger, and you get to start over when you want to make it work somewhere else. gcc's -O3 does often work, but folks are afraid of it and it doesn't get enough usage to be debugged properly, so it is not as reliable. -O2 and no optimization (-O0) get a lot of mileage, lots of bug reports, lots of fixes; choose one of those, or, as another answer said, go with what Linux uses. Or go with what Firefox uses, or what Chrome uses.

Now, hard real-time embedded systems. Manned mission systems, systems where life or property are directly affected. First, why are you using gcc? Second, yes, optimizers are often NOT used in these environments; they create too much risk and/or greatly increase the testing and validation effort. Normally you want a compiler that has itself been through a lot of testing and whose warts and traps are well known. Do you want to be the person who turned on the optimizer, and as a result the flight computer crashed the airplane into an elementary school on a school day? There is a lot to be learned from the old-timers. Yes, they have a lot of war stories and a lot of fear of newfangled things. Don't repeat history, learn from it. "They don't build 'em like they used to" means something; it is not just a saying. Those legacy systems are stable, reliable and still running for a reason: partly those old-timers and what they learned the hard way, and partly because newer stuff is built cheaper and with lower-quality components.

For this class of environment you definitely don't stop at the source code; your money and time are poured into validating the BINARY. Each time you change the binary you need to start validation over again. It is no different from the hardware it runs on: change one component, warm up one solder joint, and you start validation testing over again from the beginning. One difference, perhaps, is that in some of these environments each solder joint is only allowed a maximum number of rework cycles before you scrap the whole unit. But the same can be true in software: only so many burn cycles on the PROM before you scrap the PROM, and only so many rework cycles on the PROM pads/holes before you scrap the board/unit. Leave the optimizer off and find a better, more stable compiler and/or programming language.

Now, if this hard real-time environment is not going to hurt people or property (other than what it runs on) when it crashes, that is another story. Maybe it's a Blu-ray player and it skips a frame here and there or displays a few bad pixels; big deal. Turn the optimizer on; the masses don't care about that level of quality anymore, they are content with YouTube-quality images, compressed video formats, etc. Cars that have to be turned off and on again for the radio or Bluetooth to work? Doesn't bother them one bit. Turn the optimizer on and claim a performance gain over your competitor. If the software is too buggy to tolerate, the customers will work around it or just buy someone else's; when that one fails they will come back to you and buy your new model with the newer firmware. They will continue to do this because they want the dancing baloney; they don't want stability or quality. That stuff costs too much.

You should collect your own data: try the optimizers on the software in your environment and run the product through a full validation suite. If it doesn't break, then either the optimizer for that code on that day is okay or the test system needs more work. If you cannot do that, you can at least disassemble and analyze what the compiler is doing with your code. I would assume (and know from personal experience) that both the gcc and llvm bug systems have bugs that are tied to optimization levels; does that mean you can sort them by optimization level? I don't know. These are open-source, largely uncontrolled interfaces, so you can't rely on the masses to accurately and completely fill in the input fields; if there were an optimization field on the bug report form, it would probably always be left at the default for the form/web page. You have to examine the problem report itself to see whether the user's problem was related to the optimizer. If this were a closed system for a corporation, where an employee's performance review might suffer for not following procedure such as filling out forms correctly, you would have better searchable databases to draw information from.

The optimizer does increase your risk. Let's say 50% of the compiler's code is exercised to produce output with no optimization and another 10% to get -O1: you have increased your risk, with more compiler code used, more risk of hitting a bug, and more risk of the output being bad; still more code is exercised to get to -O2 and -O3. Reducing optimization doesn't eliminate the risk completely, but it does reduce the odds.

answered Nov 13 '22 by old_timer

In embedded systems it is frequently necessary to control the hardware by writing to registers. In C this is very easy: just initialise a pointer with the register address and away you go.

If no other part of the program reads or writes the register, it is quite likely that the optimiser will remove the assignment, breaking the code.

This particular problem can be fixed with the "volatile" keyword. But don't forget that the optimiser also reorders instructions, so if your hardware expects registers to be written in a certain order you may be burned.
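
A minimal sketch of that pattern, with made-up register addresses: volatile forces each access to actually happen and keeps volatile accesses in program order relative to each other (it does not, however, insert CPU-level memory barriers).

    #include <stdint.h>

    /* Hypothetical register addresses, for illustration only. */
    #define DATA_REG  (*(volatile uint32_t *)0x40001004u)
    #define CTRL_REG  (*(volatile uint32_t *)0x40001000u)

    void start_peripheral(uint32_t value)
    {
        /* Without volatile the optimiser could drop or reorder these
           stores, since nothing in the program appears to read them back. */
        DATA_REG = value;   /* hardware expects the data first...          */
        CTRL_REG = 1u;      /* ...then the enable bit, in this exact order */
    }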

An optimiser is supposed to produce correct results, but the interim steps may change; that is where an optimiser can hurt you.

answered Nov 13 '22 by sgbirch