
Advice on Python Parser Generators

I've been given the task of creating a parser for a simple C-like language. I can use any programming language and tools I wish, but I'm learning Python at the same time, so it would be my preferred choice.

There are a few restrictions my parser has to follow. Firstly, it must be able to read in a text file in the following format:

kind1 : spelling1
kind2 : spelling2
kind3 : spelling3
      .
      .
      .
kindn : spellingn

Each kind and spelling pair gives the type and value of one token in the language. This file is the result of running a sample of code through the language's lexical analyser.
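For illustration, reading such a file needs nothing beyond the standard library. This is only a minimal sketch: the function name `read_tokens` and the sample kinds (`int`, `id`, `semicolon`) are made up for the example, and it assumes that a kind never contains a colon.

```python
from typing import Iterable, List, Tuple

Token = Tuple[str, str]  # (kind, spelling)

def read_tokens(lines: Iterable[str]) -> List[Token]:
    """Parse 'kind : spelling' lines into (kind, spelling) pairs."""
    tokens = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first colon only, so a spelling may itself be ':'
        kind, sep, spelling = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'kind : spelling', got {line!r}")
        tokens.append((kind.strip(), spelling.strip()))
    return tokens

# Hypothetical sample; the real kinds depend on the lexical analyser.
sample = ["int : int", "id : x", "semicolon : ;"]
tokens = read_tokens(sample)
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly.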

Secondly, I must be able to customise the output of the parser. Ideally I would like to output a file in which the kind:spelling list has been converted into another sequence of tokens, which would then be passed to the language's compiler to be converted into MIPS assembly code. Here's a little example of the kind of output I would like the parser to produce:

%function int test
  %variable int x
  %variable int y
%begin
  %if %id y , %id x > %do
  %begin
    %return %num 0
  %end
  %return %num 1
%end
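To make the conversion concrete, here is a rough hand-written sketch of the step for a single statement, independent of any parser generator. It turns the tokens of `return 0;` into the `%return %num 0` line from the example. The kind names `return`, `num` and `semicolon` and the helper `emit_return` are hypothetical; the real names depend on the lexical analyser.

```python
def emit_return(tokens, pos):
    """Translate `return <num> ;` starting at tokens[pos] into one output line.

    Returns (output_line, next_pos). The kind names used here are
    assumptions made for this example only.
    """
    expected = ["return", "num", "semicolon"]
    kinds = [kind for kind, _ in tokens[pos:pos + 3]]
    if kinds != expected:
        raise SyntaxError(f"expected {expected} at token {pos}, got {kinds}")
    number = tokens[pos + 1][1]  # spelling of the num token
    return f"%return %num {number}", pos + 3

tokens = [("return", "return"), ("num", "0"), ("semicolon", ";")]
line, next_pos = emit_return(tokens, 0)
print(line)
```

A parser generator would replace the manual kind check with a grammar rule, but the output step would still amount to formatting the matched spellings in this way.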

It would be a great help if someone could advise me on existing Python parser generators and on whether they would let me achieve the sort of thing shown in the examples above.

greenie asked Nov 21 '09 17:11



2 Answers

PyParsing is a Python library for building parsers directly in Python code. It comes with a lot of interesting examples.

Easy to get started:

from pyparsing import Word, alphas

# define grammar
greet = Word( alphas ) + "," + Word( alphas ) + "!"

# input string
hello = "Hello, World!"

# parse input string
print(hello, "->", greet.parseString(hello))
miku answered Oct 04 '22 15:10


I recommend that you check out Lark: https://github.com/erezsh/lark

It can parse all context-free grammars, it automatically builds an AST (with line and column numbers), and it accepts the grammar in EBNF format, which is simple to write and widely considered the standard.

Erez answered Oct 04 '22 14:10