I would like to use the excellent pyparsing package to parse a Python function call in its most general form. I read one post here that was somewhat useful, but it was still not general enough.
I would like to parse the following expression:
f(arg1,arg2,arg3,...,kw1=var1,kw2=var2,kw3=var3,...)
where
I was wondering if a grammar could be defined for such a general template. Perhaps I am asking too much... Would you have any idea?
Thank you very much for your help.
Eric
Is that all? Let's start with a simple informal BNF for this:
func_call ::= identifier '(' func_arg [',' func_arg]... ')'
func_arg ::= named_arg | arg_expr
named_arg ::= identifier '=' arg_expr
arg_expr ::= identifier | real | integer | string | dict_literal | list_literal | tuple_literal | func_call
identifier ::= (alpha|'_') (alpha|num|'_')*
alpha ::= some letter 'a'..'z' 'A'..'Z'
num ::= some digit '0'..'9'
Translating to pyparsing, work bottom-up:
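from pyparsing import (Word, alphas, alphanums, Forward, Group,
                       delimitedList, quotedString, unicodeString)
identifier = Word(alphas+'_', alphanums+'_')
# definitions of real, integer, dict_literal, list_literal, tuple_literal go here
# see further text below
# define a placeholder for func_call - we don't have it yet, but we need it now
func_call = Forward()
string = unicodeString | quotedString
# func_call and string are listed before identifier: '|' takes the first
# alternative that matches, and identifier alone would match the 'g' in g(1)
# or the 'u' in u'abc' and leave the rest unparsed
arg_expr = func_call | string | identifier | real | integer | dict_literal | list_literal | tuple_literal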
named_arg = identifier + '=' + arg_expr
# to define func_arg, must first see if it is a named_arg
# why do you think this is?
func_arg = named_arg | arg_expr
# now define func_call using '<<' instead of '=', to "inject" the definition
# into the previously declared Forward
#
# Group each arg to keep its set of tokens separate, otherwise you just get one
# continuous list of parsed strings, which is almost as worthless as the original
# string
func_call << identifier + '(' + delimitedList(Group(func_arg)) + ')'
Those arg_expr elements could take a while to work through, but fortunately, you can get them off the pyparsing wiki's Examples page: http://pyparsing.wikispaces.com/file/view/parsePythonValue.py
from parsePythonValue import (integer, real, dictStr as dict_literal,
listStr as list_literal, tupleStr as tuple_literal)
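If parsePythonValue.py is not at hand, the grammar can still be tried out with simplified stand-ins. The following is only a minimal sketch: the regex-based real and integer definitions and the parse_call helper are assumptions made for illustration, not the definitions from the wiki example, and dict, list and tuple literals are left out entirely.

```python
from pyparsing import (Forward, Group, Regex, Word, alphanums, alphas,
                       delimitedList, quotedString)

identifier = Word(alphas + '_', alphanums + '_')

# Simplified, unsigned stand-ins (assumptions) for the numeric literals
real = Regex(r'\d+\.\d*')
integer = Regex(r'\d+')

# Placeholder so that arg_expr can refer to func_call before it is defined
func_call = Forward()

# func_call is tried first: it starts with an identifier, so a bare
# identifier listed earlier would match the name and leave the '(' unparsed
arg_expr = func_call | quotedString | identifier | real | integer
named_arg = identifier + '=' + arg_expr
func_arg = named_arg | arg_expr

func_call << identifier + '(' + delimitedList(Group(func_arg)) + ')'


def parse_call(text):
    """Hypothetical helper: parse one call, return tokens as nested lists."""
    return func_call.parseString(text, parseAll=True).asList()


print(parse_call('hypot(a,b)'))
print(parse_call('f(x, kw=g(1.5))'))
```

Each argument comes back as its own sub-list because of the Group, so a named argument can be recognized by the '=' in second position of its group.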
You still might get args passed using *list_of_args or **dict_of_named_args notation. Expand arg_expr to support these:
deref_list = '*' + (identifier | list_literal | tuple_literal)
deref_dict = '**' + (identifier | dict_literal)
arg_expr = func_call | string | identifier | real | integer | dict_literal | list_literal | tuple_literal | deref_dict | deref_list
Write yourself some test cases now - start simple and work your way up to complicated:
sin(30)
sin(a)
hypot(a,b)
len([1,2,3])
max(*list_of_vals)
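One way to run these test cases is sketched below. It is self-contained, so it uses simplified stand-ins rather than the parsePythonValue.py definitions: the regex numbers, the reduced list_literal, and the check helper are all assumptions made for this illustration.

```python
from pyparsing import (Forward, Group, Optional, ParseException, Regex, Word,
                       alphanums, alphas, delimitedList, quotedString)

identifier = Word(alphas + '_', alphanums + '_')
real = Regex(r'\d+\.\d*')      # simplified stand-in
integer = Regex(r'\d+')        # simplified stand-in

arg_expr = Forward()
func_call = Forward()

# Hypothetical reduced list literal; parsePythonValue.py is more complete
list_literal = Group('[' + Optional(delimitedList(arg_expr)) + ']')

deref_list = '*' + (identifier | list_literal)
deref_dict = '**' + identifier

arg_expr << (func_call | quotedString | identifier | real | integer
             | list_literal | deref_dict | deref_list)
named_arg = identifier + '=' + arg_expr
func_arg = named_arg | arg_expr
func_call << identifier + '(' + delimitedList(Group(func_arg)) + ')'

test_cases = ['sin(30)', 'sin(a)', 'hypot(a,b)', 'len([1,2,3])',
              'max(*list_of_vals)']


def check(text):
    """Return True if the whole string parses as a function call."""
    try:
        func_call.parseString(text, parseAll=True)
        return True
    except ParseException:
        return False


results = {text: check(text) for text in test_cases}
for text, ok in results.items():
    print('%-20s %s' % (text, 'ok' if ok else 'FAILED'))
```

Passing parseAll=True matters here: without it, a string such as sin(30)xyz would be accepted because the parser stops quietly after the closing parenthesis.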
Additional argument types that will need to be added to arg_expr (left as further exercise for the OP):
indexed arguments : dictval['a'], divmod(10,3)[0], range(10)[::2]
object attribute references : a.b.c
arithmetic expressions : sin(30), sin(a+2*b)
comparison expressions : sin(a+2*b) > 0.5, 10 < a < 20
boolean expressions : a or b and not (d or c and b)
lambda expression : lambda x : sin(x+math.pi/2)
list comprehension
generator expression
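As a possible starting point for the arithmetic case only, pyparsing's infixNotation helper can build the operator-precedence layers. This is a rough sketch under stated assumptions: it covers just the four binary operators, uses a simplified unsigned number, drops named arguments, and parse_arith_call is a hypothetical helper name.

```python
from pyparsing import (Forward, Group, Regex, Word, alphanums, alphas,
                       delimitedList, infixNotation, oneOf, opAssoc)

identifier = Word(alphas + '_', alphanums + '_')
number = Regex(r'\d+(?:\.\d*)?')   # unsigned; signs are left out of this sketch

arg_expr = Forward()
func_call = Forward()

# A call must be tried before a bare identifier, for the same reason as before
operand = func_call | identifier | number

# Operators listed from highest to lowest precedence
arith_expr = infixNotation(operand, [
    (oneOf('* /'), 2, opAssoc.LEFT),
    (oneOf('+ -'), 2, opAssoc.LEFT),
])

arg_expr << arith_expr
func_call << identifier + '(' + delimitedList(Group(arg_expr)) + ')'


def parse_arith_call(text):
    """Hypothetical helper: parse one call, return tokens as nested lists."""
    return func_call.parseString(text, parseAll=True).asList()


print(parse_arith_call('sin(a+2*b)'))
print(parse_arith_call('hypot(x, sin(y)*2)'))
```

Comparison and boolean operators could be added as further levels in the same list, while indexing, attribute access, lambdas and comprehensions need their own grammar rules.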