Upfront, I'm making an interpreter in Python, not an actual compiler that compiles to machine code. I've been skimming quite a few compiler construction guides lately and I understand the basics of generating tokens for your parser and building a syntax tree to evaluate arithmetic expressions, but I don't quite understand how to parse expressions with function calls inside them, things like
Fig. (a)
1 + pow(1, 1)
or how to parse lines when the user is defining a function like this
Fig. (b)
function newFunction( someArgs ){
... some magic ...
}
In Fig. (a), how should I tokenize this expression? After reading the reserved word "pow" should I grab everything up to the closing parenthesis and pass that to a parser? or should I include "pow", "(", "1", "1", and ")" each as seperate tokens and add them to my parse tree?
In Fib. (b) I don't have any idea where to even start when it comes to compiling function definitions. Any information to put me in the right direction would be appreciated.
Edit: I'm using a Backus-Naur form grammar:
S ::= expression
expression ::= term | term ([+-] term)+
term ::= factor | factor ([*/] factor)+
factor ::= number | ( expression )
number ::= [0-9]+
Another poster suggests adding the names of functions to your grammar.
That works for toy languages, but not for practical ones where there may be huge libraries as well as large sets of user defined functions.
You can handle the latter by adding function calls to the BNF, in a way that leaves the function name out of the grammar:
S ::= expression
expression ::= term | term ([+-] term)+
term ::= factor | factor ([*/] factor)+
factor ::= number | ( expression ) | identifier | functioncall
functioncall ::= identifier [(] arguments [)]
arguments ::= empty | arguments
arguments ::= expression | arguments [,] expression
number ::= [0-9]+
identifier ::= [a-z]+
Now your parser can pick up function calls. (Left out of the grammer is a way to define functions... but that's just more syntax which I leave it for you to add).
The price for this is that after parsing, something has decide for each function name, precisely which bit of code it represents. You will need a symbol table to do this. That's another question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With