I'm currently in the process of choosing a project for a grad-level compiler course to be done over the next 8 weeks. I'd like to do something related to optimization since I haven't worked much in that area before, but anything in the field is fair game.
What was the most interesting compiler-related project you've done? What did you learn the most from?
Edit: Thank you all for your great suggestions. I apologize for not updating this for so long.
The project I ended up doing was a simple autovectorization optimization on LLVM. LLVM has vector types, but there didn't seem to be any way to take advantage of them without support from the front end. Our optimization converted ordinary scalar code into vector code.
Since auto-vectorization is a fairly difficult optimization to implement, we limited our scope as much as we could. First, to expose instruction-level parallelism in the code, we looked for one-block loops that matched our criteria, then unrolled them a fixed number of times so they would be conveniently vectorizable. We then implemented the packing algorithm laid out in "Exploiting Superword Level Parallelism with Multimedia Instruction Sets" by Larsen and Amarasinghe.
Even a simplified version of this optimization is pretty complicated. There are a lot of constraints; for instance, you don't want to vectorize a variable that is live out of the loop, since the rest of the program expects it to be scalar. We put in a lot of hours in the last few weeks, but the project was a lot of fun and we learned a lot.
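To give a flavor of what the pass did, here's a toy illustration in C++ (not our actual LLVM pass, just the shape of the transformation the packing algorithm targets):

```cpp
// A toy illustration of the idea: the source loop below is unrolled 4x,
// producing four isomorphic statements that the Larsen/Amarasinghe
// packing algorithm can fuse into one vector operation.

void scale(float* a, const float* b, int n) {
    // Original one-block loop (assume n is a multiple of 4 for brevity):
    //   for (int i = 0; i < n; ++i)
    //       a[i] = b[i] * 2.0f;

    for (int i = 0; i < n; i += 4) {
        // After 4x unrolling, these four statements are isomorphic and
        // touch adjacent memory, so they pack into a single <4 x float>
        // multiply in LLVM's vector IR.
        a[i + 0] = b[i + 0] * 2.0f;
        a[i + 1] = b[i + 1] * 2.0f;
        a[i + 2] = b[i + 2] * 2.0f;
        a[i + 3] = b[i + 3] * 2.0f;
    }
    // Note: a scalar accumulator like `sum += b[i]` that is live out of
    // the loop would *not* be packed this way, since the rest of the
    // program still expects a scalar. That's one of the constraints we
    // had to check.
}
```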
With an 8-week timeframe, you're going to need to be careful about "scope creep". That is, don't be too ambitious, especially if this project includes other aspects of compiler construction (lexing/parsing), or if you're still learning the tools (debugger, yacc) and intermediate data structures (DAGs).
That said, my first suggestion would be to try some Live Variable Analysis. The algorithms are pretty well established, so you'd pretty much just need to code them up against your own data structures.
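For reference, the core of it is a small backward dataflow fixed-point loop. Here's a minimal C++ sketch; the `Block` struct and its fields are assumptions about your IR, but the iteration itself is the textbook algorithm (out[B] = ∪ in[S] over successors S; in[B] = use[B] ∪ (out[B] − def[B])):

```cpp
#include <set>
#include <string>
#include <vector>

struct Block {
    std::set<std::string> use_;  // variables read before any write in the block
    std::set<std::string> def_;  // variables written in the block
    std::vector<int>      succ;  // indices of successor blocks in the CFG
    std::set<std::string> in, out;
};

void liveness(std::vector<Block>& cfg) {
    bool changed = true;
    while (changed) {                   // iterate to a fixed point
        changed = false;
        for (auto& b : cfg) {
            std::set<std::string> out;
            for (int s : b.succ)        // out[B] = union of in[S]
                out.insert(cfg[s].in.begin(), cfg[s].in.end());

            std::set<std::string> in = b.use_;
            for (const auto& v : out)   // in[B] = use[B] + (out[B] - def[B])
                if (!b.def_.count(v)) in.insert(v);

            if (in != b.in || out != b.out) {
                b.in = std::move(in);
                b.out = std::move(out);
                changed = true;
            }
        }
    }
}
```

Visiting blocks in reverse order converges faster, but any order reaches the same fixed point.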
This would let you do a limited form of Dead Code Removal: if you detect that a variable is declared but never used, don't allocate space for it; if you detect that a value is set but never read, don't generate the assignment.
Live Variable Analysis can help with Register Allocation as well, so you might be able to tackle that if there's time, and you should be able to reuse some of what you build for Dead Code Removal.
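Both uses fall out of the same per-statement live sets. A hedged sketch (the names here are hypothetical, not from any particular framework; `liveAfter` would come from the liveness pass above):

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Stmt {
    std::string def;                  // variable assigned here ("" if none)
    std::set<std::string> liveAfter;  // variables live just after this stmt
};

// Dead Code Removal: an assignment is dead if the assigned variable
// is not live immediately afterwards.
bool isDeadStore(const Stmt& s) {
    return !s.def.empty() && !s.liveAfter.count(s.def);
}

// Register Allocation: the defined variable interferes with everything
// live across its definition; these pairs are the edges of the
// interference graph you'd later color.
std::vector<std::pair<std::string, std::string>>
interferenceEdges(const Stmt& s) {
    std::vector<std::pair<std::string, std::string>> edges;
    if (s.def.empty()) return edges;
    for (const auto& v : s.liveAfter)
        if (v != s.def) edges.emplace_back(s.def, v);
    return edges;
}
```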
A few years ago, I designed a DSL and wrote the compiler for a product that my company produced. The DSL used an odd combination of declarative rules, event-driven logic, and compositional inheritance. It was a very fun project, and I learned a lot.
It really piqued my interest in parsers & compilers, so I've tried to keep up with interesting new developments in compiler technology.
With respect to optimization, here's a fun article that I read last year:
http://theory.stanford.edu/~aiken/publications/papers/asplos06.pdf
In this paper, the authors describe a technique for automatic discovery of peephole optimizations (surpassing the performance of several popular C++ compiler back-ends) without an expert having to write a bunch of special-case code. Their technique uses unsupervised learning algorithms to discover high-value peephole replacements.
After reading it, it occurred to me that their approach could be augmented by providing the algorithm with a "machine description" listing all of the instructions (with their primary effects and their side effects) supported by the target processor architecture. Then, rather than brute-forcing its way to equivalent instruction sequences, the solver could find those sequences much more easily.
The machine learning algorithm would still use empirical observations to determine the most efficient sequence of instructions (because cache effects, micro-ops, and pipelining almost demand empirical timing data), but the result equivalence could be predicted using an algebraic theorem-prover operating on the machine description.
In the paper, they talk about how their optimizer could only discover peephole replacement sequences of two or three instructions (because, otherwise, the brute-force search would take too long and consume too much memory). Putting a smart solver at the right place in the algorithm could enable it to work with longer replacement sequences.
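Just to make the idea concrete, here's a purely speculative sketch of what I mean by a "machine description" (none of this comes from the paper; the field names are my own invention):

```cpp
#include <string>
#include <vector>

// One entry per instruction the target supports. The primary effect is
// an algebraic description the theorem-prover could reason over; the
// side effects (flags, etc.) are what make naive equivalence checks hard.
struct InstrDesc {
    std::string mnemonic;                  // e.g. "add", "lea", "shl"
    std::string primaryEffect;             // e.g. "dst = src1 + src2"
    std::vector<std::string> sideEffects;  // e.g. {"sets CF", "sets ZF"}
};

// The solver would search over sequences of these entries, proving
// result equivalence symbolically, while empirical timing still decides
// which equivalent sequence is actually fastest on real hardware.
using MachineDescription = std::vector<InstrDesc>;
```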
Anyhoo... let me know when you finish with that project! And don't forget to mention me in your "Acknowledgements" section!! ;-)
If you're interested in optimization, vectorization of loops using the SSE and MMX instruction sets could be interesting.
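To see what the generated code would look like, here's a hand-vectorized version of a simple loop using SSE intrinsics, i.e., the output an auto-vectorizer would produce for you (assumes n is a multiple of 4; a real pass would add a scalar epilogue for the remainder):

```cpp
#include <xmmintrin.h>  // SSE intrinsics

void add(float* dst, const float* a, const float* b, int n) {
    for (int i = 0; i < n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);  // load 4 floats (unaligned OK)
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));  // 4 adds at once
    }
}
```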