
Testing a Code-Generator Optimization

I have written a low-level optimization for the LLVM code-generator backend. Basically, the optimization will reorder assembly instructions at the basic block level to allow a later (existing) optimization to more efficiently optimize the resultant code. There are a number of test cases I'd like to verify, and I'd like some suggestions for the testing process, as this is the first time I've attempted something like this.

Things I've considered so far:

  1. Compile benchmarks written in C and examine the resulting ASM generated using the -S option. I have done this, and compared the results with my optimization to the original results. This method allows me to see that my optimization works, but even if I write custom non-executable C files I will not be able to examine all of my desired instruction ordering test cases.

  2. Compile benchmarks to LLVM assembly, edit that, then lower it down to the target machine assembly. This may work, but because of the difference in abstraction level between LLVM ASM and target ASM, I doubt I'd be able to cover all the test cases by hacking at the LLVM ASM until it generates what I want (the command-line plumbing for this and the previous approach is sketched after this list).

  3. Use the target ASM test cases as input to LLVM and recompile using the new optimization. I was unable to find an option for either LLVM or gcc (most of whose options LLVM accepts) to accept ASM as an input.
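
For reference, the command-line plumbing for the first two approaches looks roughly like this (a sketch; the file names are hypothetical, and exact flags vary by LLVM version):

    # Approach 1: C straight to target assembly
    clang -O2 -S test.c -o ASM_after.s

    # Approach 2: stop at LLVM IR, hand-edit it, then lower it with llc
    clang -O2 -emit-llvm -S test.c -o test.ll
    # ... edit test.ll to provoke the desired instruction ordering ...
    llc -O2 test.ll -o ASM_after.s

    # Compare against the known-good reference
    diff ASM_after.s reference_ASM_after.s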

What is a good strategy for testing specific ASM test cases when validating a low-level ASM compiler optimization? Does LLVM (or gcc) have some command line options that would make this process easier?


Edit: To clarify, I'm not asking about automatically generating ASM test cases; my problem is that I already have those test cases (e.g., ASM_before.s and reference_ASM_after.s), and I need a way to pass ASM_before.s into LLVM and verify that the optimized output ASM_after.s matches the known-good reference_ASM_after.s. I'm looking for a way to do this without having to "decompile" ASM_before.s into a high-level language and then compile it (with the optimization) down to ASM_after.s.
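
For comparison, the mechanism the LLVM tree itself uses to pin down backend output is not ASM-in/ASM-out either: it feeds llc a small IR file annotated with RUN and CHECK lines and matches the emitted assembly with FileCheck. A minimal sketch (the function and the check pattern are made up; a flag enabling a custom pass would go on the llc line):

    ; RUN: llc < %s -march=x86 | FileCheck %s
    define i32 @f(i32 %a, i32 %b) {
    entry:
      %s = add i32 %a, %b
      ret i32 %s
    }
    ; CHECK: addl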

asked May 21 '11 by Zeke




1 Answer

Benchmarking is one of those slippery slopes: you can come up with a benchmark that makes any language or tool look good or bad, depending on what you are trying to prove.

First off, I normally work on ARM platforms with no operating system, so it is quite simple to time the execution, sometimes down to the clock (plus or minus one), to compare compilers or options.

Things get particularly bad once you are on a platform with a cache. If you add or remove NOPs in your startup code, the whole program changes its location in memory, which means everything changes its cache alignment; with no compiler-optimization changes at all, you can sometimes see bigger performance differences due to the cache than due to the compiler or backend optimizations you are trying to measure. A padding sketch follows.
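
A cheap way to probe the alignment effect (a sketch assuming a GNU toolchain; the file and macro names are made up) is a padding translation unit linked ahead of everything else, rebuilt with different pad counts:

    /* pad.c -- link this object first; rebuild with -DPAD_NOPS=0,4,8,16,...
       and re-time each build. With zero optimizer changes, timings often
       move anyway because every function shifts its cache alignment. */
    #define STR_(x) #x
    #define STR(x)  STR_(x)
    #ifndef PAD_NOPS
    #define PAD_NOPS 0
    #endif
    __asm__ (".text\n.rept " STR(PAD_NOPS) "\nnop\n.endr");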

I normally run a Dhrystone, but don't declare victory or failure based on that alone. You might want to run a Whetstone as well if you use floating point, or a Whetstone with a soft FPU.

As someone already mentioned, self-checking tests are a good idea, and so is real-world code. For example, compression routines: take some text (perhaps a portion of a book from Project Gutenberg), compress it, then decompress it and compare the output to the input. You can add an extra layer of validation by compressing the same input on a control platform, such as your host, and hardcoding the compressed size into the test; if the compressed output under test does not match the control, the test fails even if the round trip still produces the right output. I have also used the jpeg library to convert images to and from JPEG; if the image is not expected to return to its original state after the lossy compression, you can just do one conversion and checksum the result, verify its size, or carry a copy of the expected output and compare against it. AES and DES encryption and decryption work well too. A minimal round-trip sketch follows.
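
As a concrete example of the compression round trip (a sketch assuming zlib; build with -lz, and EXPECTED_CLEN is a made-up name for the value you would hardcode from the control run):

    /* roundtrip.c -- minimal self-checking round-trip test. */
    #include <stdio.h>
    #include <string.h>
    #include <zlib.h>

    #define EXPECTED_CLEN 0   /* hypothetical: fill in from the control run */

    int main(void)
    {
        static const char text[] =
            "It was the best of times, it was the worst of times...";
        unsigned char comp[1024], back[1024];
        uLongf clen = sizeof comp, blen = sizeof back;

        if (compress(comp, &clen, (const Bytef *)text, sizeof text) != Z_OK) {
            puts("FAIL: compress");
            return 1;
        }
        /* Catch miscompiles that still round-trip: compare against control. */
        if (EXPECTED_CLEN != 0 && clen != EXPECTED_CLEN) {
            puts("FAIL: compressed size differs from control platform");
            return 1;
        }
        if (uncompress(back, &blen, comp, clen) != Z_OK) {
            puts("FAIL: uncompress");
            return 1;
        }
        if (blen != sizeof text || memcmp(back, text, blen) != 0) {
            puts("FAIL: round-trip mismatch");
            return 1;
        }
        puts("PASS");
        return 0;
    }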

There are volumes of open-source projects you can build with your modified compiler and compare against the stock compiler or other compilers. Being real-world code, it is the kind of thing your compiler will be used on anyway. Note how, when you go to Tom's Hardware or other benchmark sites, there are many different benchmarks: the time it takes to render something, to compile gcc or the Linux kernel, to perform a database search, a whole battery of real-world applications. The various applications get various scores, and it is very rare for one platform or solution to sweep the entire battery of tests.

When performance drops as you make changes, that is the time to examine the assembly and try to figure out why. Remember what Michael Abrash (and others) said: no matter how good you think your assembly is, you still have to time it. Also try crazy things that you are sure will be slow, because sometimes you find out they are fast for reasons you never thought about. A bare-metal timing sketch follows.
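
On a bare-metal ARMv7-class core, the timing itself can be done with the PMU cycle counter (a sketch; assumes privileged mode with nothing else owning the PMU, and the function names are made up; older cores need a platform timer instead):

    /* Enable and read the ARMv7 PMU cycle counter (PMCCNTR). */
    static inline void ccnt_enable(void)
    {
        unsigned v = 1;                                   /* PMCR.E: enable */
        __asm__ volatile ("mcr p15, 0, %0, c9, c12, 0" :: "r"(v));
        v = 1u << 31;                                     /* PMCNTENSET.C */
        __asm__ volatile ("mcr p15, 0, %0, c9, c12, 1" :: "r"(v));
    }

    static inline unsigned ccnt_read(void)
    {
        unsigned c;
        __asm__ volatile ("mrc p15, 0, %0, c9, c13, 0" : "=r"(c));
        return c;
    }

    /* Usage: after - before is wrap-safe with unsigned arithmetic. */
    unsigned time_routine(void (*routine)(void))
    {
        unsigned before, after;
        ccnt_enable();
        before = ccnt_read();
        routine();
        after = ccnt_read();
        return after - before;
    }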

answered Oct 01 '22 by old_timer