Learning how programming languages work

Tags: compiler-construction, interpreter

I've been programming for years (mainly Python), but I don't understand what happens behind the scenes when I compile or execute my code.

In the vein of a question I asked earlier about operating systems, I am looking for a gentle introduction to programming language engineering. I want to be able to define and understand the basics of terms like compiler, interpreter, native code, managed code, virtual machine, and so on. What would be a fun and interactive way to learn about this?

423

asked Oct 04 '09 08:10

RexE

4 Answers

Code to execution in a nutshell

A program (code) is fed into the compiler (or interpreter).

Characters are grouped into tokens (operators such as +, identifiers, numbers), and information about the identifiers is stored in something called a symbol table.
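As a rough illustration of this step, here is a minimal tokenizer sketch in Python. The token categories, the KEYWORDS set, and the shape of the symbol table are assumptions made for this example, not how any particular compiler does it.

```python
import re

# Hypothetical token categories for a tiny C-like statement; not a full lexer.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=;]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"int"}  # assumed keyword set for this example


def tokenize(source):
    """Split source text into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


def build_symbol_table(tokens):
    """Record each distinct identifier once, with the index of its first token."""
    table = {}
    for index, (kind, text) in enumerate(tokens):
        if kind == "IDENT" and text not in table:
            table[text] = {"first_seen": index}
    return table


tokens = tokenize("int a = 6 + b * c;")
symbols = build_symbol_table(tokens)
```

A real symbol table would also hold types, scopes, and storage locations; this one only remembers where each name first appeared.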

These tokens are put together to form statements, for example int a = 6 + b * c;. The result is usually represented as a syntax tree. For the assignment above, the tree looks roughly like this:

    =
   / \
  a   +
     / \
    6   *
       / \
      b   c

Within an interpreter the tree is executed directly.
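To make "executed directly" concrete, here is a small sketch of a tree-walking evaluator. It assumes Python's own ast module does the parsing and only handles the expression part (6 + b * c); the names evaluate and BINARY_OPS are made up for the example.

```python
import ast
import operator

# Map syntax tree operator nodes to the Python functions that implement them.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(node, variables):
    """Recursively compute the value of a syntax tree node."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body, variables)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.BinOp):
        # Evaluate both children first, then apply the operator at this node.
        left = evaluate(node.left, variables)
        right = evaluate(node.right, variables)
        return BINARY_OPS[type(node.op)](left, right)
    raise NotImplementedError(type(node).__name__)


tree = ast.parse("6 + b * c", mode="eval")
result = evaluate(tree, {"b": 2, "c": 3})
print(result)  # 12
```

The multiplication sits below the addition in the tree, so it is evaluated first without the evaluator knowing anything about operator precedence.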

With a compiler, the tree is finally translated into either intermediate code or assembler code.
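A sketch of that translation step, under the assumption that the "intermediate code" is a list of instructions for an imaginary stack machine. The instruction names (PUSH_CONST, LOAD_VAR, ADD, ...) are invented for this example, and Python's ast module again stands in for the parser.

```python
import ast

# Hypothetical opcodes for an imaginary stack machine.
OPCODES = {ast.Add: "ADD", ast.Sub: "SUB", ast.Mult: "MUL", ast.Div: "DIV"}


def compile_expr(node):
    """Walk the tree in post-order and emit a flat list of instructions."""
    if isinstance(node, ast.Expression):
        return compile_expr(node.body)
    if isinstance(node, ast.Constant):
        return [("PUSH_CONST", node.value)]
    if isinstance(node, ast.Name):
        return [("LOAD_VAR", node.id)]
    if isinstance(node, ast.BinOp):
        # Operands first, operator last: the operator pops two values.
        return (
            compile_expr(node.left)
            + compile_expr(node.right)
            + [(OPCODES[type(node.op)], None)]
        )
    raise NotImplementedError(type(node).__name__)


code = compile_expr(ast.parse("6 + b * c", mode="eval"))
for instruction in code:
    print(instruction)
```

Unlike the interpreter case, nothing is computed here: the output is just another, flatter description of the same program.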

You now have one or more "object files". These contain the generated machine code without the precise jump targets (these addresses are not known yet, especially if the targets are in other object files). The object files are combined by a linker, which fills in the blanks for the jumps (and references). The output of the linker is a library (which can itself be linked) or an executable file.
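The "filling in the blanks" part can be sketched as a toy two-pass linker. Everything here is a simplification invented for illustration: the object file layout (a dict with "defines" and "code"), the instruction names, and the idea that one instruction occupies one address.

```python
def link(object_files):
    """Concatenate code sections and patch symbolic jumps with final addresses."""
    symbol_addresses = {}
    base = 0
    # Pass 1: decide where each object file lands and where its symbols end up.
    for obj in object_files:
        for name, offset in obj["defines"].items():
            if name in symbol_addresses:
                raise ValueError(f"duplicate symbol: {name}")
            symbol_addresses[name] = base + offset
        base += len(obj["code"])
    # Pass 2: copy instructions, replacing symbol names with absolute addresses.
    image = []
    for obj in object_files:
        for opcode, operand in obj["code"]:
            if opcode in ("CALL", "JUMP"):
                if operand not in symbol_addresses:
                    raise ValueError(f"undefined symbol: {operand}")
                operand = symbol_addresses[operand]
            image.append((opcode, operand))
    return image, symbol_addresses


# Two hypothetical object files: main calls a function defined elsewhere.
main_obj = {
    "defines": {"main": 0},
    "code": [("CALL", "square"), ("HALT", None)],
}
math_obj = {
    "defines": {"square": 0},
    "code": [("MUL_SELF", None), ("RET", None)],
}

image, symbols = link([main_obj, math_obj])
print(image)
print(symbols)
```

When main_obj was produced, the address of square was unknown; only after both files are laid out can the CALL be given a real target. The "undefined symbol" error is the toy version of the familiar linker error.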

If you start the executable, the program data is copied into memory, and there is some further link juggling to match the pointers with the correct memory locations. Control is then given to the first instruction.

179

answered Sep 30 '22 14:09

Toon Krijthe

In basic terms, you write source files. These are fancy text files that are taken in by the compiler, which outputs some form of executable code (what executes it depends on the type of code you're talking about). The compiler has several parts:

Some form of preprocessing on the file, which handles macros and the like (as in C).
A parser, which takes in source files, verifies that they conform to the syntactic rules of your language, and transforms the file into an in-memory data structure that is more easily manipulable by other parts of the program. This is called an Abstract Syntax Tree or AST.
Some form of AST analysis, which verifies that the actual code you wrote does not violate any rules of the language (e.g. recursion in a language that does not support it), as well as many other things.
Optimization such as tail call optimization, loop optimization, and many other kinds of optimizations.
Code generation, which is the actual process of taking the final AST and any other generated data and turning it into a binary file of some sort that can be executed or interpreted.
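Python exposes enough of its own compiler to walk through a few of these stages by hand. The sketch below uses ast.parse for parsing, a hand-written constant-folding pass as a stand-in for "optimization", and the built-in compile() for code generation. The ConstantFolder class is an illustrative assumption, not something CPython uses internally.

```python
import ast
import operator

FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    """Replace operations on two numeric constants with their result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        if (
            type(node.op) in FOLDABLE
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            value = FOLDABLE[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


source = "x = 2 * 3 + y"
tree = ast.parse(source, mode="exec")                 # parsing: text to AST
tree = ConstantFolder().visit(tree)                   # optimization: 2 * 3 becomes 6
tree = ast.fix_missing_locations(tree)
code_object = compile(tree, "<example>", "exec")      # code generation: AST to bytecode

namespace = {"y": 4}
exec(code_object, namespace)
print(namespace["x"])  # 10
```

Only the 2 * 3 part can be folded, because y is not known until the program runs.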

Interpreter:

An interpreter is a program that takes in some representation of a program that has not been compiled to code directly executable by the target machine (often a binary bytecode format) and runs the commands within. Examples are the standard implementations of Python, Java, and Lua.
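Since the question comes from a Python user, the standard dis module is an easy way to look at the binary form the CPython interpreter actually runs. The function add_scaled is just an example; the exact instruction names differ between Python versions, so no particular output is shown here.

```python
import dis


def add_scaled(b, c):
    return 6 + b * c


# Each instruction has a byte offset, a name, and a human-readable argument.
instructions = list(dis.get_instructions(add_scaled))
opnames = [instruction.opname for instruction in instructions]
for instruction in instructions:
    print(instruction.offset, instruction.opname, instruction.argrepr)

# The same program as the raw bytes stored in the function's code object.
raw_bytes = add_scaled.__code__.co_code
print(raw_bytes)
```

The function was compiled to bytecode when it was defined; calling it hands those bytes to the interpreter loop.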

Native code:

This is code that has been compiled into native instructions directly executable by the target machine. For instance, if you run on an x86 architecture, then C++ will compile to an executable file that is understandable by the processor.

Virtual Machine:

This is generally a program built to simulate the construction and operation of a processor. It may be as simple as a program that reads in bytecode and runs native language operations based on the commands the bytecode represents (though calling this a virtual machine may be a stretch), or it may be as complex as completely simulating the behavior of a processor and all associated peripherals.
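The simple end of that spectrum fits in a few lines. This is a sketch of a stack-based bytecode loop; the instruction set is invented for the example and is much smaller than anything a real virtual machine uses.

```python
def run(program, variables):
    """Execute a list of (opcode, operand) pairs on a simple operand stack."""
    stack = []
    pc = 0  # program counter: index of the next instruction
    while pc < len(program):
        opcode, operand = program[pc]
        pc += 1
        if opcode == "PUSH_CONST":
            stack.append(operand)
        elif opcode == "LOAD_VAR":
            stack.append(variables[operand])
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opcode == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()


# Hypothetical bytecode for the expression 6 + b * c.
program = [
    ("PUSH_CONST", 6),
    ("LOAD_VAR", "b"),
    ("LOAD_VAR", "c"),
    ("MUL", None),
    ("ADD", None),
    ("HALT", None),
]

result = run(program, {"b": 2, "c": 3})
print(result)  # 12
```

The fetch, decode, execute loop mirrors what a hardware processor does, which is where the "machine" in the name comes from.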

Those other answers have good points in them, but this info and links ought to get you started. Any other questions, just ask!

(Most of this article was written with the help of Wikipedia, though some was written from memory.)

answered Sep 30 '22 14:09

RCIX

Compilers, interpreters and virtual machines are just examples of implementation details. What you might look for is programming language theory, generative grammars, and language translators, and you possibly need some computer architecture to relate the theory to implementations.

Personally, I learned from Sebesta's book. It gives a very wide introduction to the subject without going into minute details. It also has a good chapter on the history of programming languages (~20 languages ~3 papers per language). It has a nice explanation of grammars and the theory of languages in general. It also gives a good introduction to Scheme, Prolog, and programming paradigms (Logic, Functional, Imperative^, Object oriented).

^ It concentrates a lot more on the imperative paradigm than on the first two.

answered Sep 30 '22 14:09

Khaled Alshaya

This site has a great series of lectures on the Structure and Interpretation of Computer Programs, which is exactly the type of thing you want to learn. The accompanying textbook is useful too, though I haven't personally read through the whole thing. I think watching the lectures is pretty good and gets you about 60% of the way there.

answered Sep 30 '22 12:09

Chii
