Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

the trick to nested structures in pyparsing

I am struggling to parse nested structures with PyParsing. I've searched many of the 'nested' example uses of PyParsing, but I don't see how to fix my problem.

Here is what my internal structure looks like:

texture_unit optionalName
{
    texture required_val
    prop_name1 prop_val1
    prop_name2 prop_val1
}

and here is what my external structure looks like, but it can contain zero or more of the internal structures.

pass optionalName
{
    prop_name1 prop_val1
    prop_name2 prop_val1

    texture_unit optionalName
    {
        // edit 2: showing use of '.' character in value
        texture required_val.file.name optional_val // edit 1: forgot this line in initial post.

        // edit 2: showing potentially multiple values
        prop_name3 prop_val1 prop_val2
        prop_name4 prop_val1
    }
}

I am successfully parsing the internal structure. Here is my code for that.

prop_ = pp.Group(pp.Word(pp.alphanums+'_')+pp.Group(pp.OneOrMore(pp.Word(pp.alphanums+'_'+'.'))))
texture_props_ = pp.Group(pp.Literal('texture') + pp.Word(pp.alphanums+'_'+'.')) + pp.ZeroOrMore(prop_)
texture_ = pp.Forward()
texture_ << pp.Literal('texture_unit').suppress() + pp.Optional(pp.Word(pp.alphanums+'_')).suppress() + pp.Literal('{').suppress() + texture_props_ + pp.Literal('}').suppress()

Here is my attempt to parse the outer structure,

pass_props_ = pp.ZeroOrMore(prop_)
pass_ = pp.Forward()
pass_ << pp.Literal('pass').suppress() + pp.Optional(pp.Word(pp.alphanums+'_'+'.')).suppress() + pp.Literal('{').suppress() + pass_props_ + pp.ZeroOrMore(texture_) + pp.Literal('}').suppress()

When I say: pass_.parseString( testPassStr )

I see errors in the console that "}" was expected.

I see this as very similar to the C struct example, but I'm not sure what is the missing magic. I'm also curious how to control the resulting data structure when using the nestedExpr.

like image 652
cyrf Avatar asked May 24 '14 03:05

cyrf


1 Answers

There are two problems:

  1. In your grammar you marked texture literal as required in texture_unit block, but there is no texture in your second example.
  2. In second example, pass_props_ coincides with texture_unit optionalName. After it, pp.Literal('}') expects }, but gives {. This is the reason for the error.

We can check it by changing the pass_ rule like this:

pass_ << pp.Literal('pass').suppress() + pp.Optional(pp.Word(pp.alphanums+'_'+'.')).suppress() + \
             pp.Literal('{').suppress() + pass_props_

print pass_.parseString(s2)

It gives us follow output:

[['prop_name', ['prop_val', 'prop_name', 'prop_val', 'texture_unit', 'optionalName']]]

We can see that pass_props_ coincides with texture_unit optionalName.
So, what we want to do: prop_ can contains alphanums, _ and ., but can not match with texture_unit literal. We can do it with regex and negative lookahead:

prop_ = pp.Group(  pp.Regex(r'(?!texture_unit)[a-z0-9_]+')+ pp.Group(pp.OneOrMore(pp.Regex(r'(?!texture_unit)[a-z0-9_.]+'))) )

Finally, working example will look like this:

import pyparsing as pp

s1 = '''texture_unit optionalName
    {
    texture required_val
    prop_name prop_val
    prop_name prop_val
}'''

prop_ = pp.Group(  pp.Regex(r'(?!texture_unit)[a-z0-9_]+')+ pp.Group(pp.OneOrMore(pp.Regex(r'(?!texture_unit)[a-z0-9_.]+'))) )
texture_props_ = pp.Group(pp.Literal('texture') + pp.Word(pp.alphanums+'_'+'.')) + pp.ZeroOrMore(prop_)
texture_ = pp.Forward()
texture_ = pp.Literal('texture_unit').suppress() + pp.Word(pp.alphanums+'_').suppress() +\
           pp.Literal('{').suppress() + pp.Optional(texture_props_) + pp.Literal('}').suppress()

print texture_.parseString(s1)

s2 = '''pass optionalName
{
    prop_name1 prop_val1.name
    texture_unit optionalName1
    {
        texture required_val1
        prop_name2 prop_val12
        prop_name3 prop_val13
    }
    texture_unit optionalName2
    {
        texture required_va2l
        prop_name2 prop_val22
        prop_name3 prop_val23
    }
}'''

pass_props_ = pp.ZeroOrMore(prop_  )
pass_ = pp.Forward()

pass_ = pp.Literal('pass').suppress() + pp.Optional(pp.Word(pp.alphanums+'_'+'.')).suppress() +\
        pp.Literal('{').suppress() + pass_props_ + pp.ZeroOrMore(texture_ ) + pp.Literal('}').suppress()

print pass_.parseString(s2)

Output:

[['texture', 'required_val'], ['prop_name', ['prop_val', 'prop_name', 'prop_val']]]
[['prop_name1', ['prop_val1.name']], ['texture', 'required_val1'], ['prop_name2', ['prop_val12', 'prop_name3', 'prop_val13']], ['texture', 'required_va2l'], ['prop_name2', ['prop_val22', 'prop_name3', 'prop_val23']]]
like image 137
NorthCat Avatar answered Sep 28 '22 01:09

NorthCat