Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is "parsing" a subset of "compiling"?

When I think of "compiling" I think of turning C++ code into a binary. Or perhaps C# into CLR byte code. But "parsing" could be something like parsing Python, or a web template language, where it doesn't need to produce any binaries, but can either execute the code immediately, statement-by-statement, or output HTML directly.

Would you basically be doing the same task in either case? Ignoring the language syntax, would compiling C++ be equally as difficult as parsing a website template file (Django, Smarty, whatever), or Python?

What I'm trying to allude at, is if I study "compiling" or read a book on "compiling" will I necessarily pick up the skills to parse non-compiled languages?

like image 203
mpen Avatar asked Nov 30 '10 20:11

mpen


People also ask

What is parsing and compiling?

What is Parsing in Compiler Design? The process of transforming the data from one format to another is called Parsing. This process can be accomplished by the parser. The parser is a component of the translator that helps to organise linear text structure following the set of defined rules which is known as grammar.

Is parser and compiler same?

A parser just reads a text into an internal, more abstract representation, often a tree or graph of some sort. A compiler translates such an internal representation into another format. Most often this means converting source code into executable programs. But the target doesn't have to be machine code.

What is a parsing process?

Parsing, which is the process of identifying tokens within a data instance and looking for recognizable patterns. The parsing process segregates each word, attempts to determine the relationship between the word and previously defined token sets, and then forms patterns from sequences of tokens.

What is parsing in system programming?

Parsing is known as Syntax Analysis. It contains arranging the tokens as source code into grammatical phases that are used by the compiler to synthesis output generally grammatical phases of the source code are defined by parse tree. There are various types of parsing techniques which are as follows − Top-Down Parser.


2 Answers

Short answer: parsing is not a subset of compiling.

Long answer: generally, there are a 3 steps to converting source to another format:

  1. Lexing, which converts some form of input to a token stream.
  2. Parsing, which converts the token stream into an abstract syntax tree (AST).
  3. Compiling, which converts the AST into a set of executable instructions (native code, byte code, etc).

(For very simple languages, you may not even need a parser, you might be able to compile the token stream directly, or your parser could output native code directly.)

So start with a raw string like this:

let x = 0
while x < 10
    print x
    x := x + 1

A lexer is going to convert it into a token stream, probably something like this:

[LET; String("x"); EQ; Int(0); NEWLINE; WHILE; String("x");
 LT; VAL(10); ... ]

The parser will convert the stream into a more meaningful data structure, your abstract syntax tree:

// AST definition
type expr =
    | Block of expr list
    | Assign of string * expr
    | While of expr * expr
    | Call of string * expr list
    | Add of expr * expr
    | Var of string
    | Int of int

// AST instance created from token stream
Block
    [
        Assign("x", Int(10));
        While
        (
            LessThan(Var("x"), Int(10)),
            Block
                [
                    Call("print", [Var("x")]);
                    Assign("x", Add(Var("x"), Int(1)));
                ]
        );
    ]

Once you have an AST, you can do whatever it wants with it:

  • You convert the AST to native code (compiling).
  • or you could interpret the AST on the fly, which you might do with a dynamic programming language or a templating engine.
  • or you could iterate the AST to make a syntax highlighter.
  • or you could walk the AST and output equivalent code in another language.
  • or you could look for all instances of Var("x") and replace them with Var("y") similar to a code refactor tool).

So, while you usually parse input before compiling, that's not the same as saying that parsing is a subset of compiling.

like image 134
Juliet Avatar answered Sep 27 '22 19:09

Juliet


No, parsing and compiling can be completely independent.

  • A parser may not be emitting any code at all. It could be parsing some data object (JSON, XML, whatever)
  • A compiler may not have source code to start with - it could be presented with an abstract syntax tree, already parsed, and just have to emit the relevant code

Most compilers include a parsing step, but I don't think it's necessarily a "subset" of compiling, and parsing certainly doesn't have to have anything to do with compilation.

like image 35
Jon Skeet Avatar answered Sep 27 '22 20:09

Jon Skeet