I have an AST derived from the ANTLR Parser Generator for Java. What I want to do is somehow construct a control flow graph of the source code, where each statement or expression is a unique Node. I understand there must be some recursiveness to this identification, I was wondering what you would suggest as the best option and if ANTLR has a toolset I can use for this job. Cheers, Chris
EDIT - My main concern is to get a control flow graph(CFG) from the AST. This way I can get a tree representation of the source. To clarify, both the source code and the implementation language is Java.
Usually CFGs are computed on a lower-level representation (e.g. JVM bytecode). Someone did a thesis on such things a few years ago. There might be a helpful way described in there for how to get at that representation.
Since your source and target languages are the same, there's no code generation step -- you're already done! However, now you get to walk the AST. At each node of the AST, you have to ask yourself: is this a "jumping" instruction or not? Method calls and if statements are examples of jumping instructions. So are loop constructs (such as for and while). Instructions such as addition and multiplication are non-jumping.
First associate with each java statement a node in the CFG, along with an entry and exit node. As a first approximation, walk the tree and:
This will give you some kind of CFG. The procedure is slightly hairy in step 2 because the method called may be declared in a library, and not elsewhere in the AST -- if so, either don't make an edge or make an edge to a special node representing the entry to that library method.
Does this make sense?
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With