Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Compilers: How to parse function calls and function definitions

Upfront, I'm making an interpreter in Python, not an actual compiler that compiles to machine code. I've been skimming quite a few compiler construction guides lately and I understand the basics of generating tokens for your parser and building a syntax tree to evaluate arithmetic expressions, but I don't quite understand how to parse expressions with function calls inside them, things like

Fig. (a)

1 + pow(1, 1)

or how to parse lines when the user is defining a function like this

Fig. (b)

function newFunction( someArgs ){
   ... some magic ...
}

In Fig. (a), how should I tokenize this expression? After reading the reserved word "pow" should I grab everything up to the closing parenthesis and pass that to a parser? or should I include "pow", "(", "1", "1", and ")" each as seperate tokens and add them to my parse tree?

In Fib. (b) I don't have any idea where to even start when it comes to compiling function definitions. Any information to put me in the right direction would be appreciated.

Edit: I'm using a Backus-Naur form grammar:

S ::= expression

expression ::= term | term ([+-] term)+

term ::= factor | factor ([*/] factor)+

factor ::= number | ( expression )

number ::= [0-9]+

like image 876
kjh Avatar asked Oct 02 '22 23:10

kjh


1 Answers

Another poster suggests adding the names of functions to your grammar.

That works for toy languages, but not for practical ones where there may be huge libraries as well as large sets of user defined functions.

You can handle the latter by adding function calls to the BNF, in a way that leaves the function name out of the grammar:

S ::= expression

expression ::= term | term ([+-] term)+

term ::= factor | factor ([*/] factor)+

factor ::= number | ( expression ) | identifier |  functioncall

functioncall ::= identifier [(]  arguments [)]

arguments ::= empty | arguments 

arguments ::= expression |  arguments [,] expression

number ::= [0-9]+

identifier ::= [a-z]+

Now your parser can pick up function calls. (Left out of the grammer is a way to define functions... but that's just more syntax which I leave it for you to add).

The price for this is that after parsing, something has decide for each function name, precisely which bit of code it represents. You will need a symbol table to do this. That's another question.

like image 147
Ira Baxter Avatar answered Oct 10 '22 13:10

Ira Baxter