Which Python tool can you recommend to parse programming languages? It should allow for a readable representation of the language grammar inside the source, and it should be able to scale to complicated languages (something with a grammar as complex as e.g. Python itself).
When I search, I mostly find pyparsing, which I will be evaluating, but of course I'm interested in other alternatives.
Edit: Bonus points if it comes with good error reporting and source code locations attached to syntax tree elements.
Experimenting is risky. Because the generated C parser is the one Python itself uses, a mistake made while adding new rules to the grammar can leave you unable to compile and run Python correctly.
A parser is a software component that takes input data (frequently text) and builds a data structure – often some kind of parse tree, abstract syntax tree or other hierarchical structure, giving a structural representation of the input while checking for correct syntax.
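To make that definition concrete, here is a minimal hand-written sketch of a recursive-descent parser for arithmetic expressions. It is not taken from any of the libraries mentioned here; the names `Node`, `lex`, `Parser` and `parse` are made up for illustration. It builds a tree, records the character offset of each node, and raises `SyntaxError` on malformed input.

```python
import re
from dataclasses import dataclass

TOKEN_RE = re.compile(r"\d+|[-+*/()]")


@dataclass
class Node:
    kind: str       # "num", or an operator such as "+" or "*"
    children: list  # [int] for "num", otherwise [left Node, right Node]
    pos: int        # character offset of the token that created this node


def lex(text):
    """Split text into (token, offset) pairs, ending with a (None, offset) marker."""
    tokens, pos = [], 0
    while True:
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            tokens.append((None, pos))  # end-of-input marker
            return tokens
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        tokens.append((m.group(), pos))
        pos = m.end()


class Parser:
    def __init__(self, text):
        self.tokens = lex(text)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def expr(self):  # expr := term (("+" | "-") term)*
        node = self.term()
        while self.peek()[0] in ("+", "-"):
            op, pos = self.advance()
            node = Node(op, [node, self.term()], pos)
        return node

    def term(self):  # term := atom (("*" | "/") atom)*
        node = self.atom()
        while self.peek()[0] in ("*", "/"):
            op, pos = self.advance()
            node = Node(op, [node, self.atom()], pos)
        return node

    def atom(self):  # atom := NUMBER | "(" expr ")"
        tok, pos = self.advance()
        if tok is not None and tok.isdigit():
            return Node("num", [int(tok)], pos)
        if tok == "(":
            node = self.expr()
            closing, cpos = self.advance()
            if closing != ")":
                raise SyntaxError(f"expected ')' at offset {cpos}")
            return node
        raise SyntaxError(f"unexpected {tok!r} at offset {pos}")


def parse(text):
    """Parse a whole expression and reject trailing input."""
    parser = Parser(text)
    tree = parser.expr()
    tok, pos = parser.peek()
    if tok is not None:
        raise SyntaxError(f"unexpected {tok!r} at offset {pos}")
    return tree
```

Calling `parse("1 + 2 * 3")` returns a `+` node whose right child is a `*` node, so operator precedence is encoded in the tree shape, and every node keeps the offset it came from.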
I really like pyPEG. Its error reporting isn't very friendly, but it can add source code locations to the AST.
pyPEG doesn't have a separate lexer, which would make parsing Python itself hard (I think CPython recognises indent and dedent in the lexer), but I've used pyPEG to build a parser for a subset of C# with surprisingly little work.
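The point about indentation can be checked with the standard library's `tokenize` module, which exposes Python's token stream. This is only a small sketch: the helper name `indentation_tokens` is made up, and it shows what the tokenizer reports rather than how the C implementation works internally.

```python
import io
import tokenize


def indentation_tokens(src):
    """Return (token name, start position) pairs for INDENT/DEDENT tokens."""
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(src).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            result.append((tokenize.tok_name[tok.type], tok.start))
    return result


source = "if x:\n    y = 1\nz = 2\n"
for name, start in indentation_tokens(source):
    print(name, start)
```

For this input the tokenizer emits one INDENT before `y = 1` and one DEDENT before `z = 2`, so block structure is already explicit in the token stream before the grammar sees it. A parser without a separate lexer would have to track indentation some other way.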
An example adapted from fdik.org/pyPEG/: A simple language like this:

    function fak(n) {
        if (n==0) { // 0! is 1 by definition
            return 1;
        } else {
            return n * fak(n - 1);
        };
    }

A pyPEG parser for that language:

    import re
    from pyPEG import keyword  # assumes pyPEG 1.x, where keyword lives in the pyPEG module

    # Each function is one grammar rule: a list means alternatives, a tuple a sequence.
    def comment():          return [re.compile(r"//.*"),
                                    re.compile(r"/\*.*?\*/", re.S)]
    def literal():          return re.compile(r'\d*\.\d*|\d+|".*?"')
    def symbol():           return re.compile(r"\w+")
    def operator():         return re.compile(r"\+|\-|\*|\/|\=\=")
    def operation():        return symbol, operator, [literal, functioncall]
    def expression():       return [literal, operation, functioncall]
    def expressionlist():   return expression, -1, (",", expression)
    def returnstatement():  return keyword("return"), expression
    def ifstatement():      return (keyword("if"), "(", expression, ")", block,
                                    keyword("else"), block)
    def statement():        return [ifstatement, returnstatement], ";"
    def block():            return "{", -2, statement, "}"
    def parameterlist():    return "(", symbol, -1, (",", symbol), ")"
    def functioncall():     return symbol, "(", expressionlist, ")"
    def function():         return keyword("function"), symbol, parameterlist, block
    def simpleLanguage():   return function