I am wondering if there is any way to get some meta information about the interpretation of a python statement during execution.
Let's assume this is a complex statement of some single statements joined with or (A, B, ... are boolean functions)
if A or B and ((C or D and E) or F) or G and H:
and I want to know which part of the statement is causing the statement to evaluate to True so I can do something with this knowledge. In the example, there would be 3 possible candidates:
A
B and ((C or D and E) or F)
G and H
And in the second case, I would like to know if it was (C or D and E) or F that evaluated to True and so on...
Is there any way without parsing the statement? Can I hook up to the interpreter in some way or utilize the inspect module in a way that I haven't found yet? I do not want to debug, it's really about knowing which part of this or-chain triggered the statement at runtime.
Edit - further information: The type of application that I want to use this in is a categorizing algorithm that inputs an object and outputs a certain category for this object, based on its attributes. I need to know which attributes were decisive for the category.
As you might guess, the complex statement from above comes from the categorization algorithm. The code for this algorithm is generated from a formal pseudo-code and contains about 3,000 nested if-elif-statements that determine the category in a hierarchical way like
if obj.attr1 < 23 and (is_something(obj.attr10) or eats_spam_for_breakfast(obj)):
return 'Category1'
elif obj.attr3 == 'Welcome Home' or count_something(obj) >= 2:
return 'Category2a'
elif ...
So aside from the category itself, I need to flag the attributes that were decisive for that category, so if I'd delete all other attributes, the object would still be assigned to the same category (due to the ors within the statements). The statements can be really long, up to 1,000 chars, and deeply nested. Every object can have up to 200 attributes.
Thanks a lot for your help!
Edit 2: Haven't found time in the last two weeks. Thanks for providing this solution, it works!
Could you recode your original code:
if A or B and ((C or D and E) or F) or G and H:
as, say:
e = Evaluator()
if e('A or B and ((C or D and E) or F) or G and H'):
...? If so, there's hope!-). The Evaluator class, upon __call__, would compile its string argument, then eval the result with (an empty real dict for globals, and) a pseudo-dict for locals that actually delegates the value lookups to the locals and globals of its caller (just takes a little black magic, but, not too bad;-) and also takes note of what names it's looked up. Given Python's and and or's short-circuiting behavior, you can infer from the actual set of names that were actually looked up, which one determined the truth value of the expression (or each subexpression) -- in an X or Y or Z, the first true value (if any) will be the last one looked up, and in a X and Y and Z, the first false one will.
Would this help? If yes, and if you need help with the coding, I'll be happy to expand on this, but first I'd like some confirmation that getting the code for Evaluator would indeed be solving whatever problem it is that you're trying to address!-)
Edit: so here's coding implementing Evaluator and exemplifying its use:
import inspect
import random
class TracingDict(object):
def __init__(self, loc, glob):
self.loc = loc
self.glob = glob
self.vars = []
def __getitem__(self, name):
try: v = self.loc[name]
except KeyError: v = self.glob[name]
self.vars.append((name, v))
return v
class Evaluator(object):
def __init__(self):
f = inspect.currentframe()
f = inspect.getouterframes(f)[1][0]
self.d = TracingDict(f.f_locals, f.f_globals)
def __call__(self, expr):
return eval(expr, {}, self.d)
def f(A, B, C, D, E):
e = Evaluator()
res = e('A or B and ((C or D and E) or F) or G and H')
print 'R=%r from %s' % (res, e.d.vars)
for x in range(20):
A, B, C, D, E, F, G, H = [random.randrange(2) for x in range(8)]
f(A, B, C, D, E)
and here's output from a sample run:
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 0), ('B', 1), ('C', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 0), ('B', 0), ('G', 1), ('H', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 0), ('B', 1), ('C', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 0), ('B', 1), ('C', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=0 from [('A', 0), ('B', 0), ('G', 0)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=1 from [('A', 1)]
R=0 from [('A', 0), ('B', 0), ('G', 0)]
R=1 from [('A', 0), ('B', 1), ('C', 1)]
You can see that often (about 50% of the time) A is true, which short-circuits everything. When A is false, B evaluates -- when B is also false, then G is next, when B is true, then C.
As far as I remember, Python does not return True or False per se:
Important exception: the Boolean operations
orandandalways return one of their operands.
The Python Standard Library - Truth Value Testing
Therefore, following is valid:
A = 1
B = 0
result = B or A # result == 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With