Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate code for AST tree parsed from a fictive language?

I read the article over at http://parsingintro.sourceforge.net/ and decided to try to rewrite it as an exercise in Ruby. Two reasons made me do this, I wanted to learn more about how to code Ruby (background in Java, PHP, C and some Python) and I wanted to learn more about parsers / compilers.

I have all the code posted at https://github.com/parse/boatcaptain. The AST tree is being generated, unfortunatly the author of the article doesn't get into concepts such as code generation and optimizations.

Can anyone help me by pointing me in the right direction on how to achieve this AST tree into "code"? This is the AST tree that is generated

I wrote a calculator in Java a few years ago, it uses a lot of similar terminology and techniques as I used in this parser. But in the calculator I had methods for eval()-ing my "classes" and therefore getting output, should I aim for doing something similar here? Source for calculator: https://github.com/parse/Uppsala-University-Courses/blob/master/ImpOOP-Calculator/src/Calculator.java

I would love feedback on my way of writing Ruby as well, I believe I still write Ruby like I would write Python, missing some nice advantages of Ruby.

like image 571
Anders H Avatar asked Feb 28 '12 04:02

Anders H


Video Answer


1 Answers

Code Generation in its most basic form is simply traversing your intermediate form - the AST - and emitting corresponding instructions in your target language.

Firstly you'll need to choose a target language. What platform do you want your input file to run on? The main options open to you are:

  • A source-to-source translator
  • A compiler to native code
  • A compiler to bytecode (to be run just-in-time on a VM)

The choice of target language can determine the amount of work you'll have to put in to map between languages. Mapping object-oriented classes down to ASM could/would be tricky, for example. Mapping inherently procedural code to stack-based code could also prove a challenge.

Whichever language you choose, the problem will no doubt boil down to the following procedure: visit the nodes of your tree, and depending on their type, emit the corresponding instruction.

Say you come across the following node in your AST (as in the one you linked to):

        =
delta       /
      alpha   beta

Seeing as it's an 'assignment' node, the Code Generator then knows it has to evaluate the RHS of the tree before sticking that value into the LHS; 'delta'. So we follow the RHS node down, and see it is a division operation. We then know we have to evaluate both the LHS and RHS of this node, before dividing them, and sticking the result in 'delta'.

So now we move down the LHS, see it's a variable, and we emit a 'load' instruction. We go back up and then down the RHS, and likewise emit a 'load' for 'beta'. We then walk back up the tree (taking both alpha and beta with us), emit the divide instruction on both the operands, store that result, pass it up the tree to the assignment emitter, and let that store it in 'delta'.

So the resulting code for this snippet might be:

load alpha
load beta
tmp = div alpha beta
store delta tmp

As for pre-existing Ruby Code Generator libraries, I'm not aware of any, sorry. I hope this answer wasn't too general or simplistic for you.

like image 86
Fraser Cormack Avatar answered Oct 06 '22 00:10

Fraser Cormack