
Symbol Table created by the C++ compiler

I was reading Effective C++, 3rd edition and in item 2 (prefer const, enums, and inlines to #defines), Scott Meyers mentions the symbol table: he explains that #defines may not appear in the symbol table.

Based on the answer here, a bit of the suggested reading thereof, and the Wikipedia article, I would define the symbol table as follows: since the compiler only produces a separate object file for each translation unit, we still need a way to reference symbols across translation units. This is done using a table that is created for every object file, so that symbols can be resolved at a later stage by the linker, when the executable or library is created from the object files. During linking, symbols are substituted with their appropriate memory addresses.

Here's what I'd like to know:

  • Is my interpretation above correct?
  • After linking, once the memory addresses have been resolved, I don't think that symbol tables are required? That is, I think that the symbol table won't be available in the executable/library; is that correct?
  • I suspect that the symbol table is also useful for other compiler tasks? Something like identifying naming conflicts perhaps?
  • The symbol table described above is not the same as the export table. In the context of Visual C++ at least, the export table defines the symbols that are explicitly declared as visible outside of the library. I suppose in a sense this is a table of symbols - but not related to the symbol table Scott is referring to.
  • Is there anything else that's interesting about the symbol table? That is, is there any additional insight about symbol tables that I ought to have?

Thank you for your time and contribution.

Pooven asked Oct 22 '14 19:10


1 Answer

Symbol tables exist both for the compiler and for the linker, but these are different tables, organized differently. The compiler puts even local variable symbols in its tables. Even the preprocessor has some sort of symbol table for #define-d names (though the preprocessor might be built into the compiler today).
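
To make the distinction between preprocessor names and compiler-level names concrete, here is a minimal sketch. All identifiers (RATIO_MACRO, ratio_const, and the functions) are hypothetical and chosen only for illustration; the comments describe typical behaviour, not a guarantee for every toolchain.

```cpp
#include <cassert>

// Textual substitution: the preprocessor replaces RATIO_MACRO with 1.5
// before the compiler proper ever sees the name.
#define RATIO_MACRO 1.5

// A named, typed object: the compiler records `ratio_const` in its own
// symbol table, so diagnostics and debuggers can usually refer to it by name.
const double ratio_const = 1.5;

// `result` is a local variable: a compiler-level symbol that is never
// exported to the linker's symbol table.
double scaled_by_macro(double width) {
    double result = width * RATIO_MACRO;
    return result;
}

double scaled_by_const(double width) {
    return width * ratio_const;
}

// Taking an address only makes sense for the named object;
// `&RATIO_MACRO` would expand to `&1.5` and fail to compile.
const double* address_of_ratio() {
    return &ratio_const;
}

// Small self-check (hypothetical helper) showing both forms compute the same value.
void self_check() {
    assert(scaled_by_macro(2.0) == scaled_by_const(2.0));
    assert(*address_of_ratio() == 1.5);
}
```

Both functions compute the same value; the difference is only in which tools still know the name after preprocessing.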

The linker symbol table is mostly for exported or imported global symbols. Be aware that the linker also performs relocation, and that linking behaves quite differently on Windows and on Linux (__declspec(dllimport) on Windows, __attribute__((visibility(...))) on Linux). Notice that for dynamic libraries, some linking happens at runtime (dynamic loading). For C++, name mangling can happen. Read also about vague linkage, template instantiation, and link-time optimization in GCC.
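
As a rough illustration of how linkage, visibility, and name mangling affect what ends up in the linker's symbol table, consider the sketch below. The macro DEMO_EXPORT and all function names are hypothetical, and the code assumes GCC or Clang on a Unix-like system, or a Windows compiler that accepts __declspec(dllexport).

```cpp
#include <cassert>

// Hypothetical export macro: the spelling differs between Windows and
// GCC/Clang on Linux, as mentioned above.
#if defined(_WIN32)
#define DEMO_EXPORT __declspec(dllexport)
#else
#define DEMO_EXPORT __attribute__((visibility("default")))
#endif

// Two overloads with external C++ linkage: the compiler must give them
// distinct (mangled) names so the linker can tell them apart.
int scale(int x) { return x * 2; }
double scale(double x) { return x * 2.0; }

// extern "C" disables C++ name mangling, so the symbol keeps its plain name.
extern "C" DEMO_EXPORT int demo_scale_int(int x) { return scale(x); }

// Internal linkage: visible to the compiler inside this translation unit,
// but not offered to other translation units as a global symbol.
namespace {
int add_one(int x) { return x + 1; }
}

int next_scaled(int x) { return add_one(scale(x)); }

// Small self-check (hypothetical helper).
void self_check() {
    assert(scale(3) == 6);
    assert(demo_scale_int(4) == 8);
    assert(next_scaled(2) == 5);
}
```

Assuming the code is saved as demo.cpp (a made-up file name) and compiled with g++ -c on Linux, running nm on the resulting object file would be expected to list mangled names such as _Z5scalei and _Z5scaled for the two overloads and the plain name demo_scale_int for the extern "C" function; nm -C shows the demangled forms. The exact output depends on the compiler, ABI, and optimization level.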

Read also Levine's book: Linkers and Loaders and e.g. the wikipage on the ELF format (used for object files, shared libraries and executables on Linux and many Unix systems).

If you have access to some Linux system, use the readelf(1), nm(1) and objdump(1) utilities. Read also Drepper's paper: How To Write Shared Libraries (on Linux)

Basile Starynkevitch answered Oct 11 '22 13:10