
Formally constructing Control Flow Graph

Tags:

language-agnostic

scala

compiler-construction

compiler-theory

I'm writing a compiler for a university project, and I would like to transform my Abstract Syntax Tree into a Control Flow Graph (CFG).

I'm thinking that the nodes (V) in the CFG should be nodes from the AST. I know algorithmically how to construct the edge set E (with G = (V, E)), but I'm having a hard time writing the process a bit more formally.

I've created this Scala-style pattern matching (pseudocode):


def edges(n: Node)(nestedin_next: Node): List[(Node, Node)] =
    n match {
       // a sequence: link c1 to c2, then recurse on the tail and on c1 itself
       case (c1 :: c2 :: tl) => (c1, c2) :: edges(c2 :: tl)(nestedin_next) ++
                                 edges(c1)(c2)
       // last statement: link it to whatever follows the enclosing construct
       case c1 :: Nil => (c1, nestedin_next) :: Nil
       case i @ IF(_, c1, c2) => (i, c1) :: (i, c2) :: edges(c1)(nestedin_next) ++
                                 edges(c2)(nestedin_next)
       case _ => Nil
     }

Which should match an AST structure like:


( IF(1,
       ASSIGN(x,1), // ia1
       ASSIGN(x,2) // ia2
     ) ::  // i1
  ASSIGN(y,2) ::  // a1
  ASSIGN(z,ADD(x,y)) :: //a2 
  IF(z, 
       RET(z), //i2r1
         ASSIGN(z,0) :: // i2a1
         RET(z) // i2r2
  ) :://i2
   Nil
)

and produce an edge set like:


{ i1 -> ia1,
   i1 -> ia2,
   ia1 -> a1,
   ia2 -> a1,
   a1 -> a2,
   a2 -> i2,
   i2 -> i2r1,
   i2 -> i2a1,
   i2a1 -> i2r2,
   i2r2 -> _|_,
   i2r1 -> _|_
}

[Figure: the CFG for the example above, rendered with dot]

Does anyone have any hints on how to do this a bit more formally than Scala "pseudocode"?

I'm thinking of something inductive like:


e[[ IF(_,b1,b2) ]] = {if -> b1} \cup {if -> b2} \cup e[[ b1 ]] \cup e[[ b2 ]]
e[[ b1, b2 ]] = e[[ b1 ]] \cup e[[ b2 ]]

(The above would only give a tree and not a graph, though: there is no edge from the end of the then-branch to the next statement, for example.)

EDIT:

I've been reading up on Kiama and dataflow for Scala, and I like the "succ" and "following" approach they use. Nevertheless, I'm having a hard time boiling that down into a more formal description, mostly because of the nifty childAttr and s.next, which hide some of the details that turn ugly when I try to specify it formally.

EDIT2:

I've been through the Dragon Book and "Modern Compiler Implementation in ML", as well as some of the other material from "Learning to write a compiler". Most of them mention data flow and control flow, but none says much about HOW to create the CFG in any formal way.

EDIT3:

Via Kiama's author, Associate Professor Dr. Tony Sloane, I received some additional book references to look up.

As far as I can see, the "way to do it" as per those books works per statement of the program rather than over the AST, and is based on basic blocks. Great input nevertheless!


asked Sep 01 '10 10:09

svrist

2 Answers

Google's Closure Compiler implements a control-flow analysis that transforms a JavaScript AST into a control-flow graph. The implementation is inspired by the paper "Declarative Intraprocedural Flow Analysis of Java Source Code".


answered Sep 29 '22 11:09

namin

If your intention is to simply create something that looks a bit more formal, then you could express these matching operations as inference rules using the standard notation. You should express it in terms of a single step of reduction, rather than recursively, because then it is sufficient to simply keep applying these rules until no more can be applied.
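As a rough sketch of what such single-step rules could look like: the notation below is an assumption made for illustration, not something taken from the question. A configuration ⟨W, E⟩ pairs a worklist W of pending (statement, successor) pairs with the edges E found so far, first(s) is the entry node of s, and ⊥ is the exit node.

```latex
% Notation assumed for this sketch (none of it comes from the question):
%   <W, E>    worklist W of pending (statement, successor) pairs, edge set E so far
%   first(s)  entry node of s: first(s_1 ; s_2) = first(s_1), otherwise first(s) = s
%   \bot      the exit node
% Requires amsmath for \text, \dfrac and gather*.
\begin{gather*}
\dfrac{a = \text{ASSIGN}(x, e)}
      {\langle W \uplus \{(a, k)\},\ E \rangle \longrightarrow \langle W,\ E \cup \{(a, k)\} \rangle}
\quad (\text{Assign}) \\[1.5ex]
\dfrac{r = \text{RET}(e)}
      {\langle W \uplus \{(r, k)\},\ E \rangle \longrightarrow \langle W,\ E \cup \{(r, \bot)\} \rangle}
\quad (\text{Ret}) \\[1.5ex]
\dfrac{}
      {\langle W \uplus \{(s_1 ; s_2,\ k)\},\ E \rangle \longrightarrow
       \langle W \cup \{(s_1, \mathit{first}(s_2)),\ (s_2, k)\},\ E \rangle}
\quad (\text{Seq}) \\[1.5ex]
\dfrac{i = \text{IF}(c, s_1, s_2)}
      {\langle W \uplus \{(i, k)\},\ E \rangle \longrightarrow
       \langle W \cup \{(s_1, k),\ (s_2, k)\},\ E \cup \{(i, \mathit{first}(s_1)),\ (i, \mathit{first}(s_2))\} \rangle}
\quad (\text{If})
\end{gather*}
```

Starting from ⟨{(p, ⊥)}, ∅⟩ for a program p, the rules are applied until W is empty. Every step swaps one pending pair for pairs over strictly smaller statements (or for none), which is the usual shape of a termination argument for rules like these.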

That said, this definition is essentially going to say exactly the same thing as your Scala code. If you really want to do anything "formal", the properties you need to prove are:

Your CFG translation algorithm always terminates.
Your CFG is minimal with respect to a given AST input.
Your algorithm derives a unique CFG for a given AST input (i.e. which CFG it produces is not non-deterministic).

I don't think your basic blocks approach (rather than a per-statement approach) is necessarily a bad idea, either. It seems perfectly reasonable that if you can match a basic block, you can write a rule that makes assertions about set membership based upon the presence of this match. It seems like the inductive definition you started sketching could work just fine.
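One way to see that an inductive definition can yield a graph rather than a tree is to thread the successor node through the recursion. The following is only a minimal sketch under that assumption: the AST types (Assign, Ret, If), the string labels, and the names first, stmtEdges and blockEdges are all hypothetical, and are neither the question's Node type nor Kiama's API.

```scala
// Hypothetical AST for illustration; each node carries only a label.
sealed trait Stmt { def label: String }
final case class Assign(label: String) extends Stmt
final case class Ret(label: String) extends Stmt
final case class If(label: String, thenB: List[Stmt], elseB: List[Stmt]) extends Stmt

object Cfg {
  // Label of the exit node, written _|_ as in the question.
  val Exit = "_|_"

  // Entry node of a block: its first statement, or `next` if the block is empty.
  def first(block: List[Stmt], next: String): String =
    block.headOption.map(_.label).getOrElse(next)

  // Edges contributed by one statement whose successor is `next`.
  def stmtEdges(s: Stmt, next: String): Set[(String, String)] = s match {
    case Assign(l) => Set(l -> next)   // falls through to the successor
    case Ret(l)    => Set(l -> Exit)   // always leaves to the exit node
    case If(l, t, e) =>
      // Branch to both arms; each arm continues at the IF's own successor.
      Set(l -> first(t, next), l -> first(e, next)) ++
        blockEdges(t, next) ++ blockEdges(e, next)
  }

  // Edges of a block: each statement's successor is the entry of the rest.
  def blockEdges(block: List[Stmt], next: String): Set[(String, String)] =
    block match {
      case Nil       => Set.empty
      case s :: rest => stmtEdges(s, first(rest, next)) ++ blockEdges(rest, next)
    }
}
```

Calling Cfg.blockEdges(program, Cfg.Exit) on an encoding of the question's example gives the eleven edges listed there, including ia1 -> a1 and ia2 -> a1, which are exactly the edges a purely tree-shaped definition misses. Because both functions are ordinary structural recursion over a finite AST and return sets, termination and determinism follow directly in this sketch.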

Something else interesting might be to try to relate (formally) structural operational semantics and your construction of CFGs. There might already be work in this area. I only did a cursory Google search and didn't find any clearly stated relationship between the two, but intuitively it seems like one should exist.

answered Sep 29 '22 12:09

Gian
