Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to make a symbol table

We have as an assignment to make a compiler. We have already made the lexical and syntax analysis but we are stuck at generation of intermediate code. We realized that we have to implement a symbol table in order to proceed to intermediate code generation and we don't know, how to do it and what it contains.

Given the code below, what should the symbol table contain? (The code is written in an educational language which is described below)

Also how can we implement scopes in our symbol table?

<PROGRAM> ::= PROGRAM ID <BLOCK> ENDPROGRAM
<BLOCK> ::= {<DECLARATIONS> <SUBPROGRAMS> <SEQUENCE>}
<DECLARATIONS> ::= ε | DECLARE <VARLIST> ENDDECLARE
<VARLIST> ::= ε | ID ( , ID )*
<SUBPROGRAMS> ::= ( <PROCORFUNC> ) *
<PROCORFUNC> ::= PROCEDURE ID <PROCORFUNCBODY> ENDPROCEDURE |
FUNCTION ID <PROCORFUNCBODY> ENDFUNCTION
<PROCORFUNCBODY> ::= <FORMALPARS> <BLOCK>
<FORMALPARS> ::= ε | ( <FORMALPARLIST> )
<FORMALPARLIST> ::= <FORMALPARITEM> ( , <FORMALPARITEM> )*
<FORMALPARITEM> ::= IN ID | INOUT ID
<SEQUENCE> ::= <STATEMENT> ( ; <STATEMENT> )*
<STATEMENT> ::= ε | <ASSIGNMENT-STAT> |
<IF-STAT> |
<WHILE-STAT> |
<FOR-STAT> |
<EXIT-STAT> |
<CALL-STAT> |
<RETURN-STAT>
<ASSIGNMENT-STAT> ::= ID := <EXPRESSION>
<IF-STAT> ::= IF <CONDITION> THEN <SEQUENCE> <ELSEPART> ENDIF
<ELSEPART> ::= ε | ELSE <SEQUENCE>
<WHILE-STAT> ::= DO {<SEQUENCE>} WHILE (<CONDITION>)
<FOR-STAT> ::= (<ASSIGNMENT-STAT>; <CONDITION>;<ASSIGNMENT-STAT>;)
{<SEQUENCE>}
<EXIT-STAT> ::= EXIT
<CALL-STAT> ::= CALL ID <ACTUALPARS>
<ACTUALPARS> ::= ( <ACTUALPARLIST> ) | ε
<ACTUALPARLIST> ::= <ACTUALPARITEM> ( , <ACTUALPARITEM> )*
<ACTUALPARITEM> ::= IN <EXPRESSION> | INOUT ID
<RETURN-STAT> ::= RETURN <EXPRESSION>
<CONDITION> ::= <BOOLTERM> (OR <BOOLTERM>)*
<BOOLTERM> ::= <BOOLFACTOR> (AND <BOOLFACTOR>)*
<BOOLFACTOR> ::= NOT [<CONDITION>] | [<CONDITION>] |
<EXPRESSION> <RELATIONAL-OPER> <EXPRESSION> |
TRUE | FALSE
<EXPRESSION> ::= <OPTIONAL-SIGN> <TERM> ( <ADD-OPER> <TERM>)*
<TERM> ::= <FACTOR> (<MUL-OPER> <FACTOR>)*
<FACTOR> ::= CONSTANT | (<EXPRESSION>) | ID <IDTAIL>
<IDTAIL> ::= ε | <ACTUALPARS>
<RELATIONAL-OPER> ::= = | < ( ε | = | > ) | > ( ε | = )
<ADD-OPER> ::= + | -
<MUL-OPER> ::= * | /
<OPTIONAL-SIGN> ::= ε | <ADD-OPER>
PROGRAM MULTIPLY
    {
    DECLARE
    A, B, C
    ENDDECLARE
    PROCEDURE Aop(INOUT A)
    {
        A=A+1;
    }
    ENDPROCEDURE
    FUNCTION Bop(IN B){
        IF [NOT[[TRUE AND FALSE]OR[TRUE]]] THEN B := 100 / 2;
        ELSE B := 100;
        ENDIF;
        RETURN B;
        }
    ENDFUNCTION
    CALL Aop(INOUT A);
    CALL Bop(IN B);
    A := 40;
    C := A * B;
    }
ENDPROGRAM
like image 596
bottled_ghost Avatar asked Dec 16 '11 13:12

bottled_ghost


People also ask

What is a symbol table in data structure?

In computer science, a symbol table is a data structure used by a language translator such as a compiler or interpreter, where each identifier (or symbols), constants, procedures and functions in a program's source code is associated with information relating to its declaration or appearance in the source.

How do you implement symbol table in C++?

A Symbol table is a data structure used by the compiler, where each identifier in program's source code is stored along with information associated with it relating to its declaration. It stores identifier as well as it's associated attributes like scope, type, line-number of occurrence, etc.


1 Answers

A symbol table maps identifiers (typically prefixed by a scope name) to information about that identifier, such as its symbol type (local variable / parameter / function / class etc.), data type, its order relative to the other identifiers in the same scope, its source code line, etc. The symbol table can be generated by traversing the abstract syntax tree, by always keeping track of which scope you're in and adding information to the symbol table whenever you hit a variable declaration. In your example, a part of the symbol table might look like this (mapping to symbol type, data type, position, and source code line):

MULTIPLY.A -> {"LOCAL", "INT", 0, 4}
MULTIPLY.B -> {"LOCAL", "INT", 1, 4}
MULTIPLY.C -> {"LOCAL", "INT", 2, 4}
MULTIPLY.Aop -> {"FUNCTION", "INT", 3, 4}
MULTIPLY.Aop.A -> {"INOUTPARAM", "INT", 0, 6}

Now, you can resolve all variable references. For instance, in the expression A := A + 1, if you know that your current scope is MULTIPLY.Aop, the symnbol table will let you find out that this A is an input/output parameter of type INT, and that it is the first parameter (this information will let you generate a stack address offset so that you can load/store the variable).

like image 110
Aasmund Eldhuset Avatar answered Sep 20 '22 14:09

Aasmund Eldhuset