Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In an interpreter, what (usually) comes after the lexer?

For a programming language interpreter, I am wondering about the sequence of events that the interpreter goes through. For instance, I think this is how it goes:

  • Interpreter gets some input
  • The lexer/tokenizer gets the input and demarcates the tokens
  • The x gets the list of tokens
  • ???
  • The code gets executed

What step(s) belongs in the ??? spot, and what goes in the place of x (that is, what recieves and operates on the tokens that the lexer has produced)?

like image 277
Seth Carnegie Avatar asked Oct 28 '25 15:10

Seth Carnegie


1 Answers

I'll start by recommending the classic and free book: Structure and Interpretation of Computer Programs (video lectures)

Lisp is the baseline interpreter and everything else is a in some ways a variation on the theme.

In general the steps are:

  • Lexical analysis take a char stream and produces tokens
  • Parsing takes the tokens (a flat list) and builds a data structure called an Abstract Syntax Tree (AST). This step can be very easy (Lisp) or amazingly complex (C++, Ruby).
  • Evaluate the AST. The details are a bit different but this is pretty much a depth first walk down the tree. Leaves are data (numbers, strings, constants, variables) nodes are either primitive functions (math, data manipulation, control structures) or higher level compound functions. Each node should reduce to something that can be fed directly into the node above it.

This last step is "code gets executed." For a compiled or Just-In-Time (JIT) language the last step is instead translating the AST back into machine instructions. It is also important to note two other steps that may be present. One is a translation into a simpler language like c--, LLVM, .NET, or Java bitecode. The other is desugaring and/or optimization that can happen between the parser and the evaluator. Haskell for example is somewhat infamous for the huge amount of desugaring that goes on.

I'll end by encouraging you to try one of the many walk-throughs for writing a Scheme (a dialect of Lisp) interpreter. There is likely one for your favorite language on the net somewhere.

like image 100
John F. Miller Avatar answered Oct 30 '25 15:10

John F. Miller