Non greedy parsing with pyparsing

I'm trying to parse a line with pyparsing. The line is made up of a number of (key, value) pairs, and what I'd like to get back is a list of those pairs. A simple example:

ids = 12 fields = name

should result in something like: [('ids', '12'), ('fields', 'name')]

A more complex example:

ids = 12, 13, 14 fields = name, title

should result in something like: [('ids', '12, 13, 14'), ('fields', 'name, title')]

PS: the tuple inside the resulting list is just an example. It could be a dict or another list or whatever, it's not that important.

But whatever I've tried so far, I get results like: [('ids', '12 fields')]

Pyparsing is eating the next key, considering it's also part of the value.

Here is some sample code:

import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Literal('=')
key_equal = key + equal
val = ~key_equal + P.Word(P.alphanums+', ')

gr = P.Group(key_equal+val)
print(gr.parseString("ids = 12 fields = name"))

Can someone help me? Thanks.

Oli asked Aug 12 '11 08:08


1 Answer

The first problem lies in this line:

val = ~key_equal + P.Word(P.alphanums+', ')

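It reads as if the value should match an alphanumeric sequence followed by the literal ', ', but Word takes a set of allowed characters, so it actually matches any run of alphanumeric characters, ',' and ' '. The ~key_equal lookahead is only checked once, at the start of the value, so it cannot stop Word from running through the space and on into the next key.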
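
To see this behaviour in isolation, here is a minimal sketch that runs only the Word part against the tail of the sample line. The name greedy_val is just an illustrative label, not something from the question.

```python
import pyparsing as P

# Word() takes a string of allowed characters, not a pattern.
# Here the allowed set is: letters, digits, comma and space.
greedy_val = P.Word(P.alphanums + ', ')

# Matching only stops at '=', the first character outside the set,
# so the next key ("fields") ends up inside the value token.
tokens = greedy_val.parseString("12 fields = name")
print(tokens)
```

The single token it prints contains both "12" and "fields", which is the same effect seen in the question.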

What you'd want instead is:

val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
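
As a quick check of that expression on its own (a sketch, leaving out the ~key_equal lookahead so only the list part is exercised):

```python
import pyparsing as P

# A value is one or more alphanumeric words separated by the literal ", ".
# combine=True joins the matched pieces back into a single string.
val = P.delimitedList(P.Word(P.alphanums), ", ", combine=True)

# parseString() does not require the whole input to match by default,
# so the unparsed " fields = name, title" tail is simply left over.
tokens = val.parseString("12, 13, 14 fields = name, title")
print(tokens)
```

The value now ends after "14", because a bare space is not followed by the ", " delimiter.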

The second problem is that you only parse one key-value pair:

gr = P.Group(key_equal+val)

Instead, you should parse as many as possible, keeping each pair in its own group:

gr = P.OneOrMore(P.Group(key_equal+val))
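
Where the Group goes relative to OneOrMore changes the shape of the result. The sketch below compares the two placements on the simple example from the question; the names pair, one_group and per_pair are only illustrative.

```python
import pyparsing as P

key = P.oneOf("ids fields")
key_equal = key + P.Literal('=')
val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
pair = key_equal + val

text = "ids = 12 fields = name"

# Group around the repetition: a single group holding every token.
one_group = P.Group(P.OneOrMore(pair)).parseString(text).asList()

# Group inside the repetition: one group per key-value pair.
per_pair = P.OneOrMore(P.Group(pair)).parseString(text).asList()

print(one_group)
print(per_pair)
```

Only the second form keeps each key together with its own value.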

So the correct solution is:

>>> import pyparsing as P
>>> key = P.oneOf("ids fields")
>>> equal = P.Literal('=')
>>> key_equal = key + equal
>>> val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
>>> gr = P.OneOrMore(P.Group(key_equal+val))
>>> print(gr.parseString("ids = 12, 13, 14 fields = name, title"))
[['ids', '=', '12, 13, 14'], ['fields', '=', 'name, title']]
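
If the list of (key, value) tuples from the question is what you are after, one possible variation is to match '=' with P.Suppress so it is dropped from the results, then turn each group into a tuple. This is only a sketch; parse_pairs and line are hypothetical names introduced for the example.

```python
import pyparsing as P

key = P.oneOf("ids fields")
equal = P.Suppress('=')  # match '=' but keep it out of the results
key_equal = key + equal
val = ~key_equal + P.delimitedList(P.Word(P.alphanums), ", ", combine=True)
line = P.OneOrMore(P.Group(key_equal + val))

def parse_pairs(text):
    """Return the parsed line as a list of (key, value) tuples."""
    return [tuple(group) for group in line.parseString(text)]

print(parse_pairs("ids = 12, 13, 14 fields = name, title"))
```

Passing the returned list to dict() gives a mapping instead, if that is more convenient.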

blubb answered Oct 04 '22 20:10