If you were constructing a compiler, what optimization at the AST level would be the nicest to have?


Your favourite Abstract Syntax Tree optimization

2 Answers

Mostly you can't do interesting optimizations at the AST level, because you need information about how data flows from one part of the program to another. While data flow is implicit in the meaning of the AST, it isn't easily determined by inspecting just the AST, which is why people building compilers and optimizers build other program representations (including symbol tables, control flow graphs, reaching definitions, data flow and SSA forms, etc.).

Having a parser for a language is the easy part of analyzing/manipulating that language. You need all that other stuff to do a good job.

If you do have all those other representations, you can think about doing optimizations at the AST level. Most folks building compilers don't bother; they convert to a data flow representation and simply optimize that. But if you want to reproduce source code with changes, you need the AST. You'll also need a prettyprinter to enable you to regenerate the source code. If you go this far, you'll end up with a source-to-source program transformation system.
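As a rough illustration of the parse, transform, prettyprint pipeline described above, here is a minimal sketch using Python's standard `ast` module (`ast.unparse` needs Python 3.9 or later). The names `FoldConstants` and `optimize_source` are made up for this example, and Python's own AST stands in for whatever language you are compiling. It folds only expressions whose operands are literals, which is about as far as you can go without data flow information.

```python
import ast
import operator

# Operators this sketch knows how to fold; anything else is left alone.
_FOLDABLE = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


class FoldConstants(ast.NodeTransformer):
    """Fold binary operations whose operands are both numeric literals."""

    def visit_BinOp(self, node):
        # Visit children first so nested literal expressions collapse bottom-up.
        self.generic_visit(node)
        fold = _FOLDABLE.get(type(node.op))
        if (
            fold is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = ast.Constant(value=fold(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node


def optimize_source(source: str) -> str:
    """Parse, transform the AST, and regenerate source text."""
    tree = ast.parse(source)
    tree = FoldConstants().visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # the "prettyprinter" step


print(optimize_source("x = 2\ny = x * 3\nz = 2 * 3 + 1\n"))
```

The `z` expression is folded because everything needed is visible in the subtree. `y = x * 3` is left untouched: the AST node for `x` carries only a name, and proving that `x` is still 2 at that point takes a reaching-definitions analysis, which this sketch does not attempt.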

The DMS Software Reengineering Toolkit is a system that transforms ASTs, using all these other representations to enable the analyses needed by the transforms.


answered Sep 25 '22 01:09

Ira Baxter

An optimisation that is easiest to do on the AST (rather than, say, the CFG) is tail-call optimisation: if you see a subtree of the form:

RETURN
    CALL f
        ARGS x, y, ...

You can replace it with a jump to f. If f(a, b) is the function that the tail-call appears in, the replacement is as simple as:

a = x; b = y
JUMP to root of tree

I find it easiest to represent that jump as a special "restart" statement, which the AST->CFG translation treats as an edge back to the first node. Jumping to other functions is a bit trickier: since you can't just set local variables, you need to think ahead about how arguments are passed to them and prepare to translate this at a lower level. For example, the AST might need a special node that can instruct a later pass to overwrite the current stack frame with the arguments and jump accordingly.
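The self-recursive case can be sketched on a toy AST as follows. Everything here is hypothetical and invented for illustration: the node classes, the `Restart` statement, the `_tmp` names (assumed not to clash with user variables), and the use of plain strings for expressions. One detail the sketch adds on its own account: arguments are first evaluated into temporaries, because an argument such as `a % b` may mention a parameter that an earlier assignment would already have overwritten.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Call:
    func: str
    args: List[Any]


@dataclass
class Return:
    value: Any


@dataclass
class Assign:
    target: str
    value: Any


@dataclass
class Restart:
    """Hypothetical statement: jump back to the first node of the function."""


@dataclass
class If:
    cond: Any
    then: List[Any]
    orelse: List[Any]


@dataclass
class Function:
    name: str
    params: List[str]
    body: List[Any]


def rewrite_block(fn: Function, block: List[Any]) -> List[Any]:
    """Replace `RETURN CALL fn(...)` with parameter assignments plus Restart."""
    out = []
    for stmt in block:
        if isinstance(stmt, If):
            out.append(If(stmt.cond,
                          rewrite_block(fn, stmt.then),
                          rewrite_block(fn, stmt.orelse)))
        elif (isinstance(stmt, Return)
              and isinstance(stmt.value, Call)
              and stmt.value.func == fn.name
              and len(stmt.value.args) == len(fn.params)):
            # Evaluate every argument before overwriting any parameter.
            temps = [f"_tmp{i}" for i in range(len(fn.params))]
            out += [Assign(t, a) for t, a in zip(temps, stmt.value.args)]
            out += [Assign(p, t) for p, t in zip(fn.params, temps)]
            out.append(Restart())
        else:
            out.append(stmt)
    return out


def eliminate_self_tail_calls(fn: Function) -> Function:
    return Function(fn.name, fn.params, rewrite_block(fn, fn.body))


gcd = Function("gcd", ["a", "b"], [
    If("b == 0", [Return("a")], []),
    Return(Call("gcd", ["b", "a % b"])),
])
optimized = eliminate_self_tail_calls(gcd)
```

In `optimized.body`, the final `Return` is replaced by assignments to `_tmp0` and `_tmp1`, then to `a` and `b`, followed by `Restart()`. Calls to any other function are left alone, since those need the lower-level treatment described above.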

answered Sep 23 '22 01:09

Edmund



Tags:

compiler-optimization

compiler-construction

abstract-syntax-tree

Flavius
