We have as an assignment to make a compiler. We have already made the lexical and syntax analysis but we are stuck at generation of intermediate code. We realized that we have to implement a symbol table in order to proceed to intermediate code generation and we don't know, how to do it and what it contains.
Given the code below, what should the symbol table contain? (The code is written in an educational language which is described below)
Also how can we implement scopes in our symbol table?
<PROGRAM> ::= PROGRAM ID <BLOCK> ENDPROGRAM <BLOCK> ::= {<DECLARATIONS> <SUBPROGRAMS> <SEQUENCE>} <DECLARATIONS> ::= ε | DECLARE <VARLIST> ENDDECLARE <VARLIST> ::= ε | ID ( , ID )* <SUBPROGRAMS> ::= ( <PROCORFUNC> ) * <PROCORFUNC> ::= PROCEDURE ID <PROCORFUNCBODY> ENDPROCEDURE | FUNCTION ID <PROCORFUNCBODY> ENDFUNCTION <PROCORFUNCBODY> ::= <FORMALPARS> <BLOCK> <FORMALPARS> ::= ε | ( <FORMALPARLIST> ) <FORMALPARLIST> ::= <FORMALPARITEM> ( , <FORMALPARITEM> )* <FORMALPARITEM> ::= IN ID | INOUT ID <SEQUENCE> ::= <STATEMENT> ( ; <STATEMENT> )* <STATEMENT> ::= ε | <ASSIGNMENT-STAT> | <IF-STAT> | <WHILE-STAT> | <FOR-STAT> | <EXIT-STAT> | <CALL-STAT> | <RETURN-STAT> <ASSIGNMENT-STAT> ::= ID := <EXPRESSION> <IF-STAT> ::= IF <CONDITION> THEN <SEQUENCE> <ELSEPART> ENDIF <ELSEPART> ::= ε | ELSE <SEQUENCE> <WHILE-STAT> ::= DO {<SEQUENCE>} WHILE (<CONDITION>) <FOR-STAT> ::= (<ASSIGNMENT-STAT>; <CONDITION>;<ASSIGNMENT-STAT>;) {<SEQUENCE>} <EXIT-STAT> ::= EXIT <CALL-STAT> ::= CALL ID <ACTUALPARS> <ACTUALPARS> ::= ( <ACTUALPARLIST> ) | ε <ACTUALPARLIST> ::= <ACTUALPARITEM> ( , <ACTUALPARITEM> )* <ACTUALPARITEM> ::= IN <EXPRESSION> | INOUT ID <RETURN-STAT> ::= RETURN <EXPRESSION> <CONDITION> ::= <BOOLTERM> (OR <BOOLTERM>)* <BOOLTERM> ::= <BOOLFACTOR> (AND <BOOLFACTOR>)* <BOOLFACTOR> ::= NOT [<CONDITION>] | [<CONDITION>] | <EXPRESSION> <RELATIONAL-OPER> <EXPRESSION> | TRUE | FALSE <EXPRESSION> ::= <OPTIONAL-SIGN> <TERM> ( <ADD-OPER> <TERM>)* <TERM> ::= <FACTOR> (<MUL-OPER> <FACTOR>)* <FACTOR> ::= CONSTANT | (<EXPRESSION>) | ID <IDTAIL> <IDTAIL> ::= ε | <ACTUALPARS> <RELATIONAL-OPER> ::= = | < ( ε | = | > ) | > ( ε | = ) <ADD-OPER> ::= + | - <MUL-OPER> ::= * | / <OPTIONAL-SIGN> ::= ε | <ADD-OPER>
PROGRAM MULTIPLY
{
DECLARE
A, B, C
ENDDECLARE
PROCEDURE Aop(INOUT A)
{
A=A+1;
}
ENDPROCEDURE
FUNCTION Bop(IN B){
IF [NOT[[TRUE AND FALSE]OR[TRUE]]] THEN B := 100 / 2;
ELSE B := 100;
ENDIF;
RETURN B;
}
ENDFUNCTION
CALL Aop(INOUT A);
CALL Bop(IN B);
A := 40;
C := A * B;
}
ENDPROGRAM
In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier (or symbols), constants, procedures and functions in a program's source code is associated with information relating to its declaration or appearance in the source.
A Symbol table is a data structure used by the compiler, where each identifier in program's source code is stored along with information associated with it relating to its declaration. It stores identifier as well as it's associated attributes like scope, type, line-number of occurrence, etc.
A symbol table maps identifiers (typically prefixed by a scope name) to information about that identifier, such as its symbol type (local variable / parameter / function / class etc.), data type, its order relative to the other identifiers in the same scope, its source code line, etc. The symbol table can be generated by traversing the abstract syntax tree, by always keeping track of which scope you're in and adding information to the symbol table whenever you hit a variable declaration. In your example, a part of the symbol table might look like this (mapping to symbol type, data type, position, and source code line):
MULTIPLY.A -> {"LOCAL", "INT", 0, 4}
MULTIPLY.B -> {"LOCAL", "INT", 1, 4}
MULTIPLY.C -> {"LOCAL", "INT", 2, 4}
MULTIPLY.Aop -> {"FUNCTION", "INT", 3, 4}
MULTIPLY.Aop.A -> {"INOUTPARAM", "INT", 0, 6}
Now, you can resolve all variable references. For instance, in the expression A := A + 1
, if you know that your current scope is MULTIPLY.Aop
, the symnbol table will let you find out that this A
is an input/output parameter of type INT
, and that it is the first parameter (this information will let you generate a stack address offset so that you can load/store the variable).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With