I wonder how the grammar of the Python language is generated and how the interpreter understands it.
In Python, the file graminit.c seems to implement the grammar, but I don't clearly understand it.
More broadly, what are the different ways to generate a grammar, and are there differences between how the grammar is implemented in languages such as Perl, Python or Lua?
Grammars are generally written in the same form; Backus-Naur Form (BNF) is typical.
Lexers and parsers, on the other hand, can take very different forms.
The lexer breaks up the input file into tokens. The parser uses the grammar to see if the stream of tokens is "valid" according to its rules.
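To make the lexing step concrete, here is a minimal sketch of a regex-based lexer. The token names and the tiny arithmetic language are made up for illustration; this is not how CPython's own tokenizer is written.

```python
import re

# Hypothetical token set for a tiny arithmetic language (an assumption for
# this sketch, not Python's real token list).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()=]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Break text into (kind, value) pairs; raise on characters no rule matches."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x = 1 + 22")` yields a flat list of tokens; deciding whether that sequence is valid is then the parser's job, not the lexer's.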
Usually the outcome is an abstract syntax tree (AST) that can then be used to generate whatever you want, such as byte code or assembly.
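Python exposes this pipeline through its standard library, so you can look at the AST yourself. The sketch below uses the `ast` module and the built-in `compile`; the source string and the `<demo>` filename are just placeholders.

```python
import ast

source = "result = 1 + 2 * 3"

# Lexing and parsing happen inside ast.parse; the outcome is an AST.
tree = ast.parse(source)
assign = tree.body[0]   # an ast.Assign node for "result = ..."
expr = assign.value     # an ast.BinOp; the multiplication is nested on its right

# A readable text dump of the tree (exact formatting varies by Python version).
dump = ast.dump(tree)

# The same tree can then be compiled to bytecode and executed.
code = compile(tree, "<demo>", "exec")
namespace = {}
exec(code, namespace)
```

The nesting of the `BinOp` nodes is where operator precedence ends up: the grammar decides that `2 * 3` binds tighter than `1 + ...`, and the tree records that decision.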
There are many ways to implement lexing and parsing; it really comes down to identifying the patterns and how they fit together. There are a few very nice Python packages for doing this, ranging from pure Python to wrapped C code. Pyparsing in particular has many excellent examples. One thing worth noting: finding a straight EBNF/BNF parser is kind of hard. Writing a parser in Python code isn't awful, but it is one step further from the raw grammar, which might be important to you.
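As a rough illustration of what "writing a parser with Python code" looks like, here is a hand-written recursive-descent sketch for a small arithmetic grammar. It uses only the standard library (not Pyparsing), and the grammar, class and function names are invented for the example. Note how each grammar rule, shown in the comment, becomes one method.

```python
import re

# Hypothetical grammar for this sketch, in EBNF:
#   expr   ::= term (("+" | "-") term)*
#   term   ::= factor (("*" | "/") factor)*
#   factor ::= NUMBER | "(" expr ")"


def tokenize(text):
    """Split text into number and operator strings, rejecting anything else."""
    tokens = re.findall(r"\d+|[+\-*/()]", text)
    # findall skips unmatched characters, so check that nothing was dropped.
    if "".join(tokens) != re.sub(r"\s+", "", text):
        raise SyntaxError("input contains characters outside the grammar")
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())  # left-associative nesting
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.advance()
        if token is not None and token.isdigit():
            return int(token)
        if token == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    """Return a nested-tuple AST, or raise SyntaxError if text is not valid."""
    parser = Parser(text)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

The trade-off mentioned above is visible here: the grammar only survives as a comment, and the real rules live in the method bodies, so changing the grammar means editing code rather than a grammar file.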