Python implementation of Parsec?

I recently wrote a parser in Python using PLY (a Python reimplementation of yacc). When I was almost done, I discovered that the grammar I need to parse requires lookups during parsing to inform the lexer. Without those lookups, I cannot correctly parse the strings in the language.

Given that I can control the state of the lexer from the grammar rules, I think I'll solve my use case with a lookup table in the parser module, but that may become too difficult to maintain and test. So I'd like to know about some of the other options.

In Haskell I would use Parsec, a library of parsing functions (known as combinators). Is there a Python implementation of Parsec? Or perhaps some other production-quality library full of parsing functionality, so I can build a context-sensitive parser in Python?

EDIT: All my attempts at context-free parsing have failed. For this reason, I don't expect ANTLR to be useful here.

Jason Dagit asked Sep 18 '08 17:09


4 Answers

I believe that pyparsing is based on the same principles as Parsec.

Peter Hart answered Oct 15 '22 04:10


PySec is another monadic parser. I don't know much about it, but it's worth looking at.
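To make "monadic parser" concrete, here is a minimal pure-Python sketch of the idea. It is not PySec's or Parsec's API; every name in it (`bind`, `digits`, `char`, `take`, `length_prefixed`) is made up for illustration. The point is that `bind` builds the next parser from the previous result, which is what lets a combinator parser handle context-sensitive input such as a length-prefixed string.

```python
from typing import Any, Callable, Optional, Tuple

# Assumption for this sketch: a parser is a plain function
# (text, pos) -> (value, new_pos), or None on failure.
Result = Optional[Tuple[Any, int]]
Parser = Callable[[str, int], Result]


def bind(parser: Parser, make_next: Callable[[Any], Parser]) -> Parser:
    """Run `parser`, then build the next parser from its result."""
    def run(text: str, pos: int) -> Result:
        first = parser(text, pos)
        if first is None:
            return None
        value, new_pos = first
        return make_next(value)(text, new_pos)
    return run


def digits(text: str, pos: int) -> Result:
    """Parse one or more ASCII decimal digits into an int."""
    end = pos
    while end < len(text) and text[end] in "0123456789":
        end += 1
    return (int(text[pos:end]), end) if end > pos else None


def char(expected: str) -> Parser:
    """Parse exactly one given character."""
    def run(text: str, pos: int) -> Result:
        if text.startswith(expected, pos):
            return (expected, pos + 1)
        return None
    return run


def take(count: int) -> Parser:
    """Parse exactly `count` characters."""
    def run(text: str, pos: int) -> Result:
        end = pos + count
        return (text[pos:end], end) if end <= len(text) else None
    return run


# Context-sensitive rule: "<n>:<exactly n characters>".
# The number parsed first decides how much the later parser consumes.
length_prefixed = bind(digits, lambda n: bind(char(":"), lambda _: take(n)))
```

Calling `length_prefixed("5:hello world", 0)` consumes the count, the colon, and then exactly five characters; with too little input after the colon it fails by returning `None`.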

rcreswick answered Oct 15 '22 04:10


If an LL parser is acceptable to you, consider giving ANTLR a try; it can generate Python too. (Strictly speaking it is LL(*), as they call it, where * stands for the amount of lookahead it can cope with.)

PW. answered Oct 15 '22 05:10


Nothing prevents you from diverting your parser from the "context-free" path using PLY. You can pass information to the lexer during parsing, and in this way achieve full flexibility. I'm pretty sure that you can parse anything you want with PLY this way.

For a hands-on example, consider - it is a parser for ANSI C written in Python with PLY. It solves the classic C typedef-identifier problem (which makes C's grammar context-sensitive) by populating a symbol table in the parser, which the lexer then uses to resolve symbol names as either types or not.
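The symbol-table feedback described above can be sketched without PLY. The following is a hand-rolled toy, not PLY's API and not the code of the parser referred to above; the token kinds, the `lex` and `parse` helpers, and the three-statement "language" are all assumptions made for illustration. The key detail is that the lexer is a lazy generator, so a name the parser registers after one statement affects how later identifiers are tokenized.

```python
import re
from typing import Iterator, List, Set, Tuple

Token = Tuple[str, str]  # (kind, text)
TOKEN_RE = re.compile(r"\s*(?:(?P<word>[A-Za-z_]\w*)|(?P<punct>[;*]))")
KEYWORDS = {"typedef", "int"}


def lex(source: str, type_names: Set[str]) -> Iterator[Token]:
    """Yield tokens lazily, consulting `type_names` for each identifier."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = match.end()
        word = match.group("word")
        if word is None:
            punct = match.group("punct")
            yield (punct, punct)
        elif word in KEYWORDS:
            yield (word, word)
        elif word in type_names:  # the lookup happens at lexing time
            yield ("TYPE_NAME", word)
        else:
            yield ("ID", word)


def parse(source: str) -> List[str]:
    """Classify each ';'-terminated statement of a toy C-like language."""
    type_names: Set[str] = set()      # symbol table owned by the parser
    results: List[str] = []
    statement: List[Token] = []
    for token in lex(source, type_names):  # the lexer shares the same set
        if token[0] != ";":
            statement.append(token)
            continue
        kinds = [kind for kind, _ in statement]
        if kinds == ["typedef", "int", "ID"]:
            type_names.add(statement[2][1])  # feed information back to the lexer
            results.append(f"typedef {statement[2][1]}")
        elif kinds == ["TYPE_NAME", "*", "ID"]:
            results.append(f"declare pointer {statement[2][1]}")
        elif kinds == ["ID", "*", "ID"]:
            results.append(f"multiply {statement[0][1]} by {statement[2][1]}")
        else:
            raise SyntaxError(f"cannot parse statement: {kinds}")
        statement = []
    if statement:
        raise SyntaxError("missing ';' at end of input")
    return results
```

With the input `a * b; typedef int a; a * b;`, the first `a * b` is read as a multiplication and the second as a pointer declaration, because `a` has been registered as a type name in between.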

Eli Bendersky answered Oct 15 '22 05:10