Compiler code generation - how do we know it is high quality?

Question

It is a common mantra nowadays: "C/C++ compilers generate better code than hand-written assembly." or "Compilers generate code that is as good as, and often better than, what could be written by hand."

But how do we know this is true? Are there any solid studies of HLL compiler code quality? I would like to read some work on this subject, not only for C/C++ but for other languages as well.

Thanks

EDIT: I am not asking for a discussion of this subject, nor for personal opinions or thoughts. I am asking for references to studies on the subject. Such studies must contain some experimental or theoretical work that can be verified.

Please, if you don't have such information, simply don't answer this question. I already know all your thoughts on this subject.

Alexey Frunze · Accepted Answer

Today's C/C++ compilers are much better than they were 15 or more years ago, since they can now consume more memory and CPU cycles (simply because we have more of them available) while optimizing code ever more aggressively.

In contrast, programmers have hardly grown a second brain in their skull in the past 15 years and their optimization abilities likely remain at about the same level now as they were 15 or even 25 years ago.

At the same time, CPUs have become more complex, and catering to their various caches, prediction mechanisms, bigger register sets, speculative and parallel execution, longer pipelines, resource contention, etc. has become harder as well. Taking care of all that is mentally taxing and scales poorly, while our software and the problems we're solving with it never stop growing in size, number and complexity. And then new versions of the CPUs often necessitate not only learning new tricks but also unlearning old ones.

Also, you're not very productive writing assembly code, especially when you need to write a lot of it. And assembly code is harder to maintain and change. For economic reasons you may not always have the option of spending lots of money and man-hours to produce high-quality optimized assembly code when the compiler can do a reasonably good job quickly, freeing time for testing and speeding up turnaround.

If you take just this into account, and you have been in the industry long enough, then you don't need special studies to see that on a large scale optimizing compilers outperform hand-crafted optimized assembly code.

And then one should remember that assembly can only give you a roughly linear (constant-factor) increase in performance, maybe 3-5x of what the compiler can do in tough cases, whereas choosing a more scalable algorithm can give you a much bigger boost. So it may be much more prudent to invest in scalable algorithms and parallel/distributed systems than in finding or training assembly programmers and paying them lots of money for the rare skill.
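To make the constant-factor versus algorithm point concrete, here is a minimal sketch in C. The function names and the `steps` counter are invented for this illustration; the counter only tallies loop iterations so the two approaches can be compared without timing anything.

```c
#include <stddef.h>

/* Linear scan: up to n comparisons. Returns the index of key, or -1.
   Hand-tuning this loop can shrink the cost per step, not the step count. */
long linear_search(const int *a, size_t n, int key, size_t *steps)
{
    *steps = 0;
    for (size_t i = 0; i < n; i++) {
        (*steps)++;
        if (a[i] == key)
            return (long)i;
    }
    return -1;
}

/* Binary search on a sorted array: about log2(n) loop iterations. */
long binary_search(const int *a, size_t n, int key, size_t *steps)
{
    size_t lo = 0, hi = n;              /* search the half-open range [lo, hi) */
    *steps = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*steps)++;
        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

On a sorted array of 1024 elements, the worst case for the linear scan is 1024 steps, while the binary search needs about 10. Making each step of the linear scan 3-5x cheaper would still not close that gap, and the gap widens as the array grows.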

Speaking of the rare skill... As people increasingly move to less primitive (or should I say less low-level?) languages than C, C++ and assembly, you become less likely to find programmers who can shine in these low-level languages and beat compilers. They still remain and there will always be some, but you shouldn't count on them on a large scale, which leaves you pretty much only with programmers who cannot beat the compiler.

You may count this as a study. :)

horsh · Answer

The mantra you are questioning was actually introduced by Backus et al. themselves in the very description of Fortran.

The FORTRAN Automatic Coding System (1957)

[The programmer] estimated that it might have taken three days to code this job by hand, plus an unknown time to debug it, and that no appreciable increase in speed of execution would have been achieved thereby.

From the modern point of view, the problem with your question is not in evaluating code produced by a compiler, but code produced by a human. You can rarely present completely hand-written assembly code for a sufficiently large program.

Nevertheless, in the situation where human beings only need to write a limited amount of code, such a comparison is possible. Consider for example:

Man vs. Machine : Comparing Handwritten and Compiler-generated Application-Level Checkpointing

where only certain pieces of code were produced both by a human and by a compiler. Or

Development of an Optimizing Compiler for a Fujitsu Fixed-Point Digital Signal Processor

where the code is generated for a DSP, hand-written code is good at sizes of tens or hundreds of lines of C code, and a program of 800 lines of C code is considered large.

Besides, there is the known issue of the Sufficiently Smart Compiler: while in theory all the needed algorithms are well known, in practice, for multiple reasons, compilers or compiler developers fail to apply them. A typical example of this problem is analyzed here:

An Evaluation of Vectorizing Compilers
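As a small illustration of why a known optimization may not get applied, consider pointer aliasing. This sketch is not taken from the paper above, the function names are made up, and whether a particular compiler vectorizes either loop depends on the compiler, its version and its flags.

```c
#include <stddef.h>

/* dst and src may overlap as far as the compiler knows, so it has to
   prove otherwise, add a runtime overlap check, or stay with scalar code. */
void scale_may_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* restrict (C99) is the programmer's promise that dst and src do not
   overlap, which removes that obstacle to vectorizing the loop. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```

Both functions compute the same result for non-overlapping arrays. The difference is only in what the compiler is allowed to assume, which is the kind of information a human writing the loop by hand already has.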

One well-known example where compilers do an exceptionally bad job is the heart of an interpreter loop.

Author of LuaJIT explains why compilers can't beat hand-coded assembly
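For readers who have not seen one, here is a minimal sketch of the kind of switch-based dispatch loop in question. The toy opcode set is invented for this illustration and has nothing to do with LuaJIT's actual bytecode; there is no stack bounds checking, so it assumes well-formed programs.

```c
#include <stddef.h>

/* Hypothetical toy bytecode, invented for this illustration. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* Runs code until OP_HALT and returns the value on top of the stack. */
long run(const int *code)
{
    long stack[64];
    size_t sp = 0;      /* number of values currently on the stack */
    size_t pc = 0;      /* index of the next instruction word */

    for (;;) {
        switch (code[pc++]) {           /* single shared dispatch point */
        case OP_PUSH:
            stack[sp++] = code[pc++];   /* operand follows the opcode */
            break;
        case OP_ADD:
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            return stack[sp - 1];
        default:
            return -1;                  /* unknown opcode */
        }
    }
}
```

For example, the program `{ OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT }` evaluates (2 + 3) * 4. Every handler lives in one function and one loop, so the compiler sees a single large control-flow graph; as I understand the linked post, that is what makes register allocation and fast-path/slow-path decisions hard for it, while a human writing each handler in assembly can pin the important state in registers.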

At some point the discussion moved to the next stage: whether an automatically generated code generator produces code as good as a hand-written code generator does.

Evaluation of Automatically-Generated Compilers
Automatically Generating the Back End of a Compiler Using Declarative Machine Descriptions

Tags:

compiler-optimization

assembly

johnfound
