I already made a scanner, now I'm supposed to make a parser. What's the difference?
In the traditional arrangement, the parser calls the scanner whenever it needs a token. This is the same pattern the scanner itself (like many other programs) follows when it calls the I/O library each time it needs more input.
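As a rough illustration of that calling arrangement, here is a minimal sketch in Python. The names `scanner` and `parse_sum` are made up for this example, and a "token" is assumed to be nothing more than a whitespace-separated word. To keep the focus on who calls whom, the parser adds the numbers up directly instead of building a tree.

```python
from typing import Iterator, Optional


def scanner(text: str) -> Iterator[str]:
    """Hypothetical scanner: hands out one token per request.

    Assumption: tokens are simply whitespace-separated words.
    """
    for word in text.split():
        yield word


def parse_sum(text: str) -> int:
    """Hypothetical parser for inputs shaped like 'NUMBER (+ NUMBER)*'."""
    tokens = scanner(text)

    def need_number() -> int:
        # The parser asks the scanner for the next token only when it needs one.
        tok: Optional[str] = next(tokens, None)
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected a number, got {tok!r}")
        return int(tok)

    total = need_number()
    while True:
        op = next(tokens, None)  # pull one more token on demand
        if op is None:           # scanner has nothing left: input is complete
            return total
        if op != "+":
            raise SyntaxError(f"expected '+', got {op!r}")
        total += need_number()
```

Because `scanner` is a generator, no token is produced until the parser requests it with `next`, which mirrors the way a scanner asks the I/O library for more characters.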
Parsers are used when input such as source code needs to be represented abstractly as a data structure, so that its syntax can be checked. Programming languages and many other technologies rely on some form of parsing for this purpose.
SUMMARY. The scanner is a subroutine that is called frequently by an application program such as a compiler. Its primary function is to combine characters from the input stream into recognizable units called tokens.
A parser just reads a text into an internal, more abstract representation, often a tree or graph of some sort. A compiler translates such an internal representation into another format. Most often this means converting source code into executable programs. But the target doesn't have to be machine code.
A Scanner simply turns an input String (say a file) into a list of tokens. These tokens represent things like identifiers, parentheses, operators etc.
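A small scanner along those lines might look like the sketch below. The token kinds (`NUMBER`, `IDENT`, `OP`, `LPAREN`, `RPAREN`) and the function name `scan` are assumptions chosen for a toy expression language, not part of any particular tool.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

# Hypothetical token kinds for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP", r"\s+"),  # whitespace is recognized but not emitted
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def scan(text: str) -> List[Token]:
    """Turn an input string into a flat list of (kind, text) tokens."""
    tokens: List[Token] = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For the input `x = (a + 42)`, this yields an identifier, an operator, a left parenthesis, and so on. The result is a flat list: the scanner has no notion of how the tokens nest or relate to each other.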
A parser converts this list of tokens into a Tree-like object to represent how the tokens fit together to form a cohesive whole (sometimes referred to as a sentence).
In terms of programming language parsers, the output is usually referred to as an Abstract Syntax Tree (AST). Each node in the AST represents a different construct of the language, e.g. an IF statement would be a node with 2 or 3 sub nodes, a CONDITION node, a THEN node and potentially an ELSE node.
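To make the IF example concrete, here is one way such a node could be modelled in Python. The class names `Name` and `If`, their field names, and the helper `child_count` are illustrative assumptions; real compilers define their own node types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Name:
    """Hypothetical leaf node: a bare identifier."""
    ident: str


@dataclass
class If:
    """Hypothetical IF node with 2 or 3 sub nodes."""
    condition: object               # the CONDITION node
    then: object                    # the THEN node
    orelse: Optional[object] = None  # the ELSE node, when present


def child_count(node: If) -> int:
    """Count the sub nodes of an IF node: 2 without ELSE, 3 with it."""
    return 2 if node.orelse is None else 3


# IF ready THEN go
without_else = If(Name("ready"), Name("go"))
# IF ready THEN go ELSE wait
with_else = If(Name("ready"), Name("go"), Name("wait"))
```

Note that the tree records only structure. Nothing here says what `ready` refers to or whether it has been declared.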
A parser does not give the nodes any meaning beyond structural cohesion. The next thing to do is extract meaning from this structure (sometimes called contextual analysis).
Parsing (in a general sense) is about turning the symbols (characters, digits, left parens, etc) into sentences of your grammar.
The lexical analyzer (the "lexer") groups individual symbols from the source code file into tokens. From there, the "parser" proper turns those tokens into sentences of your grammar.
Put another way, the lexer combines symbols into tokens, and the parser combines tokens to form sentences.
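The two stages can be sketched together in a few lines of Python. Everything here is an assumption made for illustration: the toy grammar (sums and products with parentheses), the function names `lex` and `parse`, and the choice of nested tuples as the tree. The parser uses recursive descent, which is only one of several parsing techniques.

```python
import re
from typing import List, Optional


def lex(text: str) -> List[str]:
    """Combine characters into tokens: integers, names, and + * ( ).

    Simplification: characters that match nothing are silently skipped.
    """
    return re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", text)


def parse(tokens: List[str]):
    """Combine tokens into a tree of nested tuples like ('+', left, right)."""
    pos = 0

    def peek() -> Optional[str]:
        return tokens[pos] if pos < len(tokens) else None

    def advance() -> Optional[str]:
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():  # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():  # term := atom ('*' atom)*
        node = atom()
        while peek() == "*":
            advance()
            node = ("*", node, atom())
        return node

    def atom():  # atom := NUMBER | NAME | '(' expr ')'
        tok = advance()
        if tok == "(":
            node = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree
```

Calling `parse(lex("a + 2 * (b + 3)"))` first produces a flat token list and then a nested structure in which the multiplication sits below the addition, which is exactly the information the flat list lacks.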