Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

get elements in "AND" in logic string with Python

I want to parse logic strings and get all the combinations of elements that are in an "and" logic. For instance, for the string '( A and ( B or C ) )' I should get [[A,B],[A,C]] and for the string '( A and B and ( C or D and F ) or F and G )' I should get [[A,B,C],[A,B,D,F],[F,G]].

I'm trying to use pyparsing. Following this post here parsing a complex logical expression in pyparsing in a binary tree fashion I manage to get a nested list with the letters grouped according to preferences ("and" has preference over "or", and parenthesis overrides this):

import pyparsing as pp

complex_expr = pp.Forward()
vars = pp.Word(pp.alphas, pp.alphanums + "_") | pp.Regex(r"[+-]?\d+(:?\.\d*)?(:?[eE][+-]?\d+)?").setName('proteins')
clause = pp.Group(vars ^ (pp.Suppress("(") + complex_expr + pp.Suppress(")") ))

expr = pp.operatorPrecedence(clause,[
                            ("and", 2, pp.opAssoc.LEFT, ),
                            ("or", 2, pp.opAssoc.LEFT, ),])
#print expr
complex_expr << expr
parseresult=complex_expr.parseString('( A and B and ( C or D and F ) or F and G )')
print parseresult

Which gives:

[[[[['A'], 'and', ['B'], 'and', [[['C'], 'or', [['D'], 'and', ['F']]]]], 'or', [['F'], 'and', ['G']]]]]

Now how can I process this result to achieve the desired output? I would be greateful for any help. I tried pyparsing but I'm open to other modules which may be better.

Thanks in advance.

like image 834
Rosa Avatar asked Oct 19 '22 07:10

Rosa


1 Answers

Python libraries are going to help us a little bit:

import re
import itertools

Let's write the required function:

def analyse(expression):
    # Find all used symbols
    symbols = set(re.findall(r"\b[A-Z]\b", expression))
    # Find all combinations of symbols and values
    mappings = (dict(zip(symbols, values)) for values in itertools.product([False, True], repeat=len(symbols)))
    # Select combinations that make the whole expression true
    solutions = [sorted(name for name in mapping if mapping[name]) for mapping in mappings if eval(expression, None, mapping)]
    # Filter out redundant solutions
    return sorted(s1 for s1 in solutions if not any(set(s1) > set(s2) for s2 in solutions))

And let's test it:

assert analyse("( A and ( B or C ) )") == [["A", "B"], ["A", "C"]]
assert analyse("( A and B and ( C or D and F ) or F and G )") == [["A", "B", "C"], ["A", "B", "D", "F"], ["F", "G"]]

There are comments in the source code. Anyway, the main steps are:

  • The expression variables are found as single-character uppercase names.
  • Each variable can be either True or False. We find all combinations.
  • We select only such combinations that make the whole expression True.
  • We keep only minimal solutions, i.e. those that are not supersets of other solutions.

I would like to thank you very much for a nice question. The Python's itertools never stop surprising me. ;-)

like image 72
dlask Avatar answered Oct 23 '22 10:10

dlask