I want to generate in Python all possible RPN (Reverse Polish notation) expressions, that use letters from an input list (such as <code>['a', 'b', 'c']</code>) and contain operators <code>['+', '-', '*', '/']</code>. My idea was that we can add elements to the current expression until one of the following happens: either we've used all letters or expression is complete (i.e. we can't add more operators). So I wrote the following functions: 1) <pre class="prettyprint"><code>''' The function returns True if we can add operator to current expression: we scan the list and add +1 to counter when we meet a letter and we add -1 when we meet an operator (it reduces last two letters into 1 (say ab+ <--> a + b) ''' def can_add_operator(string_): n = 0 for letter in string_: if letter not in ['+', '-', '*', '/']: n += 1 else: n -= 1 result = n > 1 return result can_add_operator('ab*c+')) #False can_add_operator('ab*c') #True </code></pre> 2) <pre class="prettyprint"><code>''' The function returns a list, that contains operators and letters, one of which we can add to the current expression. ''' def possible_elements(items, string_): #list of all letters that we have not used yet result = [x for x in items if x not in string_] if can_add_operator(string_): result = ["+", "-", "*", "/"] + result return result letters = ['a', 'b', 'c', 'd'] possible_elements(letters, 'ab*c') #['+', '-', '*', '/', 'd'] possible_elements(letters, '') #['a', 'b', 'c', 'd'] possible_elements(letters, 'ab*c+d+') #[] </code></pre> 3) Next I wrap it into recursion: <pre class="prettyprint"><code>#exp -- current expression, base of recursion is exp = '' def rec(exp, Final_sol = []): elements_to_try = possible_elements(letters, exp) for i in elements_to_try: if len(possible_elements(letters, exp + i)) == 0: Final_sol.append(exp + i) else: rec(exp+i, Final_sol) return Final_sol #we start with an empty string Final_sol = rec('') print(len(Final_sol)) #7680 </code></pre> There are two difficulties with that function: <ol> <li> The first one is that if there are repeated letters in list of letters, it won't return all possible results. For example if <code>letters = ['a', 'b', 'a']</code>: <pre class="prettyprint"><code>Final_sol = rec('') print(len(Final_sol)) #32 str(Final_sol) >> "['ab+', 'ab-', 'ab*', 'ab/', 'ba+', 'ba-', 'ba*', 'ba/', 'ba+', 'ba-', 'ba*', 'ba/', 'ab+', 'ab-', 'ab*', 'ab/', 'ab+', 'ab-', 'ab*', 'ab/', 'ba+', 'ba-', 'ba*', 'ba/', 'ba+', 'ba-', 'ba*', 'ba/', 'ab+', 'ab-', 'ab*', 'ab/']" </code></pre> So the output missing <code>'ab+a+'</code> and so on. But I do want to return all possible combinations in that case too. </li> <li>The second problem is that there are many "equivalent" strings in the output. Since we have the commutative and associative properties in prefix form, expressions like <code>ab+c+</code>/<code>abc++</code>/<code>ca+b+</code> should be considered as equivalent: I want only one of each group in the output of the function rec().</li> </ol> How could we change the functions above to overcome such difficulties? What is the most elegant solution to problem?

<blockquote> The first one is that if there are repeated letters in list of letters, it won't return all possible results. </blockquote> We can attack this problem by using a different approach to generate the permutations: <pre class="prettyprint"><code>from itertools import permutations variables = ['a', 'a', 'b', 'c'] operators = ['+', '-', '*', '/'] equations = set() for permutation in permutations(variables): a, b, *rest = permutation operations = permutations(operators) for permutation in operations: equation = zip([a + b, *rest], permutation) equations.add("".join(variable + operator for variable, operator in equation)) </code></pre> Using a <code>set()</code> will eliminate any duplications caused by repeated variables. <blockquote> The second problem is that there are many "equivalent" strings in the output. Since we have the commutative and associative properties </blockquote> To deal with the commutative issue, we'll use pattern matching to reduce the equations: <pre class="prettyprint"><code>import sys import re DEBUG = True remove = set() # Reduce commutative equivalents: ca*a-b/ same as ac*a-b/ if DEBUG: print("Reduce commutative equivalents:", file=sys.stderr) for equation in equations: if equation not in remove: for match in re.finditer(r"(?=(.+)(\w)[+*])", equation): a, _ = match.span(1) _, d = match.span(2) equivalent = equation[:a] + match[2] + match[1] + equation[d:] if equivalent != equation and equivalent in equations: remove.add(equivalent) if DEBUG: print(f"Removed {equivalent} same as {equation}", file=sys.stderr) equations -= remove </code></pre> Because we've built all equations as ab op c op d op, etc. I don't believe we generate the associative equivalents, but if we did, we could use a similar technique to try to thin them out: <pre class="prettyprint"><code>remove = set() # Reduce associative equivalents aa+b*c- same as ab*ab*+c- if DEBUG: print("Reduce associative equivalents:", file=sys.stderr) for equation in equations: if equation not in remove: for match in re.finditer(r"(?=(\w)([+])(\w)([*]))", equation): a, _ = match.span(1) _, d = match.span(4) equivalent = equation[:a] + match[3] + match[4] + match[1] + match[3] + match[4] + match[2] + equation[d:] if equivalent != equation and equivalent in equations: remove.add(equivalent) if DEBUG: print(f"Removed {equivalent} same as {equation}", file=sys.stderr) equations -= remove </code></pre> And finally dump out our reduced set: <pre class="prettyprint"><code>if DEBUG: print("Final equations:", file=sys.stderr) print(equations) </code></pre> OUTPUT <pre class="prettyprint"><code>> python3 test.py Reduce commutative equivalents: Removed ac+a-b/ same as ca+a-b/ Removed ab*a/c- same as ba*a/c- Removed cb*a/a- same as bc*a/a- Removed ac+b-a/ same as ca+b-a/ Removed ba+c/a- same as ab+c/a- Removed ba+a-c/ same as ab+a-c/ Removed ac+a/b- same as ca+a/b- Removed ac+b/a- same as ca+b/a- Removed ac*b-a/ same as ca*b-a/ Removed bc*a-a/ same as cb*a-a/ Removed ca*a-b/ same as ac*a-b/ Removed ba*a-c/ same as ab*a-c/ Removed cb+a/a- same as bc+a/a- Removed ba+c-a/ same as ab+c-a/ Removed ca*a/b- same as ac*a/b- Removed ca*b/a- same as ac*b/a- Removed ba+a/c- same as ab+a/c- Removed ab*c-a/ same as ba*c-a/ Removed ab*c/a- same as ba*c/a- Removed cb+a-a/ same as bc+a-a/ Reduce associative equivalents: Final equations: {'ca+a-b/', 'cb*a+a-', 'aa/b-c*', 'ba/c-a*', 'cb/a-a*', 'ab+a*c/', 'aa/c+b-', 'bc/a-a+', 'aa*b+c-', 'ba*a/c-', 'ab+c/a*', 'ca-a/b+', 'ca-b+a*', 'bc*a/a-', 'bc/a+a*', 'ac+a/b*', 'bc+a*a-', 'ca/a-b+', 'ac-a*b+', 'ba-a*c/', 'ac/b-a*', 'ba-c+a*', 'ba+a-c*', 'aa+b/c-', 'ca-b*a/', 'ca+b-a/', 'ab+c/a-', 'ac*b+a-', 'aa+c-b/', 'aa*c/b-', 'ab/c*a+', 'ac+b/a*', 'aa+b*c/', 'ab-a*c+', 'ac+a-b*', 'cb-a+a*', 'cb*a/a+', 'ab-c/a+', 'ac*b+a/', 'ba*c/a+', 'ba/c+a*', 'aa-b*c+', 'aa/b+c*', 'ab-c*a+', 'ac+a*b/', 'ac/b+a-', 'aa*b-c+', 'ac-a+b/', 'aa-c*b+', 'ab+a-c/', 'aa-c+b/', 'ba+c*a/', 'ca-b*a+', 'ab-a/c*', 'aa-b/c+', 'ac*a+b/', 'ba/a+c-', 'ba-c/a+', 'cb/a+a*', 'ca+b/a*', 'aa/c*b+', 'ac-a+b*', 'ba-a+c*', 'ca+a*b/', 'aa+b/c*', 'aa/c-b+', 'bc*a/a+', 'ca+a/b-', 'ca+b/a-', 'ca*b-a/', 'ac/b*a-', 'aa*b/c+', 'ba/a*c+', 'bc/a*a+', 'ca-b+a/', 'ac/b+a*', 'aa*b/c-', 'bc-a+a/', 'ca/b-a*', 'ba-c*a/', 'cb*a-a/', 'ba-c/a*', 'aa*b+c/', 'ac*a-b/', 'ca*b/a+', 'aa+b-c*', 'ba/a-c*', 'ca-b/a+', 'ab/c-a+', 'cb+a/a*', 'aa-c/b*', 'ba+c*a-', 'cb*a+a/', 'aa*c/b+', 'ab/c+a*', 'ca+b-a*', 'aa+b-c/', 'ac-b*a/', 'ab*a-c/', 'ba-a*c+', 'ba*c+a-', 'bc/a*a-', 'ba*c-a+', 'ba/c*a+', 'ab-c+a/', 'ba*c+a/', 'ca*a-b+', 'bc+a/a-', 'aa+c*b-', 'ab+c*a-', 'ac-a/b+', 'ca+a-b*', 'aa+c-b*', 'ab/c*a-', 'ab+c-a/', 'bc+a/a*', 'ac-a/b*', 'ab/a-c*', 'ac/a-b+', 'bc-a/a+', 'ab+a*c-', 'ac/a-b*', 'ca*a+b-', 'ab/a-c+', 'ab-a*c/', 'cb/a*a-', 'ac/a+b*', 'bc-a/a*', 'ac-b+a*', 'ac*a/b-', 'ba*a+c-', 'ba/a-c+', 'bc/a+a-', 'aa/b-c+', 'cb+a-a*', 'ca-b/a*', 'ca+b*a-', 'ac*b/a-', 'ca-a+b/', 'ca/b*a-', 'ba+a/c*', 'cb-a*a+', 'ac+a*b-', 'aa*b-c/', 'aa*c-b/', 'ac/a*b+', 'aa-c+b*', 'ca*a+b/', 'ca/b+a-', 'ac*a/b+', 'aa+c/b-', 'ab/c+a-', 'ab+a/c-', 'cb-a+a/', 'ab*a-c+', 'ab-a+c*', 'ab+a/c*', 'ac/b-a+', 'ab*c+a/', 'ba/c+a-', 'ba/c*a-', 'cb-a*a/', 'ac+b*a-', 'ba+c-a*', 'ac/b*a+', 'cb/a*a+', 'cb-a/a+', 'bc*a+a/', 'ac*b/a+', 'cb+a*a-', 'ba*c-a/', 'ca-a*b/', 'ca-a*b+', 'ab/a*c-', 'ba-a+c/', 'ba*a/c+', 'bc-a+a*', 'ca+a/b*', 'ca*a/b+', 'aa*c+b-', 'ba*c/a-', 'bc/a-a*', 'ca/a+b*', 'ab-a+c/', 'ca/b*a+', 'ab-a/c+', 'cb*a-a+', 'aa-b/c*', 'ac-b/a+', 'aa*c-b+', 'ab*c+a-', 'cb/a-a+', 'ab/a+c*', 'ba+a*c-', 'ba*a+c/', 'ba-a/c*', 'aa/b+c-', 'ba/c-a+', 'ca/b-a+', 'ab*a/c+', 'bc+a-a*', 'bc*a-a+', 'ab+c*a/', 'ab-c*a/', 'ac*a+b-', 'ca/a+b-', 'ac/a*b-', 'ac+b-a*', 'ba/a+c*', 'ba-a/c+', 'ab*c/a+', 'cb/a+a-', 'ca/a-b*', 'ac-b/a*', 'ab/a*c+', 'ca*b+a/', 'ac-a*b/', 'aa/b*c+', 'aa/c-b*', 'ca/a*b+', 'bc-a*a/', 'ca+b*a/', 'aa*c+b/', 'ab*a+c/', 'bc+a*a/', 'ab-c/a*', 'ca-a+b*', 'aa-c*b/', 'cb-a/a*', 'aa+b*c-', 'ca+a*b-', 'aa-b+c*', 'ac/a+b-', 'ba-c+a/', 'ba-c*a+', 'ca*b-a+', 'ac-b+a/', 'aa-b*c/', 'aa-b+c/', 'ac*a-b+', 'ac+b*a/', 'ca/a*b-', 'bc+a-a/', 'bc-a*a+', 'ba+a*c/', 'ac*b-a+', 'aa/c+b*', 'ab/a+c-', 'ab/c-a*', 'ab-c+a*', 'ba+c/a*', 'ab*c-a+', 'ab+a-c*', 'cb+a*a/', 'ac-b*a+', 'ba/a*c-', 'ab*a+c-', 'ab+c-a*', 'bc*a+a-', 'aa/b*c-', 'ca*b+a-', 'ba*a-c+', 'ca/b+a*', 'aa-c/b+', 'aa+c/b*', 'ca-a/b*', 'aa/c*b-', 'aa+c*b/'} > </code></pre> I'm not claiming a perfect solution, just illustrating some of the tools available to you to solve your problem.

Generating all possible "unique" RPN (Reverse Polish notation) expressions

Tags:

python

algorithm

recursion

postfix-notation

rpn

I want to generate in Python all possible RPN (Reverse Polish notation) expressions, that use letters from an input list (such as ['a', 'b', 'c']) and contain operators ['+', '-', '*', '/'].

My idea was that we can add elements to the current expression until one of the following happens: either we've used all letters or expression is complete (i.e. we can't add more operators).

So I wrote the following functions:

'''
The function returns True if we can add operator to current expression:
we scan the list and add +1 to counter when we meet a letter
and we add -1 when we meet an operator (it reduces
last two letters into 1 (say ab+ <--> a + b)
''' 
def can_add_operator(string_):
    n = 0
    for letter in string_:
        if letter not in ['+', '-', '*', '/']:
            n += 1
        else:
            n -= 1
    result = n > 1
    return result


    can_add_operator('ab*c+')) #False
    can_add_operator('ab*c')  #True

'''
The function returns a list, that contains operators and
letters, one of which we can add to the current expression.
'''

def possible_elements(items, string_):
    #list of all letters that we have not used yet
    result =  [x for x in items if x not in string_]
    if can_add_operator(string_):
        result = ["+", "-", "*", "/"] + result
    return result

letters = ['a', 'b', 'c', 'd']
possible_elements(letters, 'ab*c') #['+', '-', '*', '/', 'd']
possible_elements(letters, '') #['a', 'b', 'c', 'd']
possible_elements(letters, 'ab*c+d+') #[]

3) Next I wrap it into recursion:

#exp -- current expression, base of recursion is exp = ''
def rec(exp, Final_sol = []):
    elements_to_try = possible_elements(letters, exp)
    for i in elements_to_try:
        if len(possible_elements(letters, exp + i)) == 0:
            Final_sol.append(exp + i)
        else:
            rec(exp+i, Final_sol)
    return Final_sol

#we start with an empty string
Final_sol = rec('')
print(len(Final_sol)) #7680

There are two difficulties with that function:

The first one is that if there are repeated letters in list of letters, it won't return all possible results.

For example if letters = ['a', 'b', 'a']:

Final_sol = rec('')
print(len(Final_sol)) #32
str(Final_sol)
>> "['ab+', 'ab-', 'ab*', 'ab/', 'ba+', 'ba-', 'ba*', 'ba/', 'ba+', 'ba-', 
'ba*', 'ba/', 'ab+', 'ab-', 'ab*', 'ab/', 'ab+', 'ab-', 'ab*', 'ab/', 'ba+', 
'ba-', 'ba*', 'ba/', 'ba+', 'ba-', 'ba*', 'ba/', 'ab+', 'ab-', 'ab*', 
 'ab/']"

So the output missing 'ab+a+' and so on. But I do want to return all possible combinations in that case too.

The second problem is that there are many "equivalent" strings in the output. Since we have the commutative and associative properties in prefix form, expressions like ab+c+/abc++/ca+b+ should be considered as equivalent: I want only one of each group in the output of the function rec().

How could we change the functions above to overcome such difficulties? What is the most elegant solution to problem?

205

asked Jan 27 '19 00:01

Dima

2 Answers

The first one is that if there are repeated letters in list of letters, it won't return all possible results.

We can attack this problem by using a different approach to generate the permutations:

from itertools import permutations

variables = ['a', 'a', 'b', 'c']

operators = ['+', '-', '*', '/']

equations = set()

for permutation in permutations(variables):
    a, b, *rest = permutation

    operations = permutations(operators)

    for permutation in operations:

        equation = zip([a + b, *rest], permutation)

        equations.add("".join(variable + operator for variable, operator in equation))

Using a set() will eliminate any duplications caused by repeated variables.

The second problem is that there are many "equivalent" strings in the output. Since we have the commutative and associative properties

To deal with the commutative issue, we'll use pattern matching to reduce the equations:

import sys
import re

DEBUG = True

remove = set()

# Reduce commutative equivalents: ca*a-b/ same as ac*a-b/
if DEBUG:
    print("Reduce commutative equivalents:", file=sys.stderr)

for equation in equations:
    if equation not in remove:
        for match in re.finditer(r"(?=(.+)(\w)[+*])", equation):

            a, _ = match.span(1)
            _, d = match.span(2)

            equivalent = equation[:a] + match[2] + match[1] + equation[d:]

            if equivalent != equation and equivalent in equations:
                remove.add(equivalent)
                if DEBUG:
                    print(f"Removed {equivalent} same as {equation}", file=sys.stderr)

equations -= remove

Because we've built all equations as ab op c op d op, etc. I don't believe we generate the associative equivalents, but if we did, we could use a similar technique to try to thin them out:

remove = set()

# Reduce associative equivalents aa+b*c- same as ab*ab*+c-
if DEBUG:
    print("Reduce associative equivalents:", file=sys.stderr)

for equation in equations:
    if equation not in remove:
        for match in re.finditer(r"(?=(\w)([+])(\w)([*]))", equation):

            a, _ = match.span(1)
            _, d = match.span(4)

            equivalent = equation[:a] + match[3] + match[4] + match[1] + match[3] + match[4] + match[2] + equation[d:]

            if equivalent != equation and equivalent in equations:
                remove.add(equivalent)
                if DEBUG:
                    print(f"Removed {equivalent} same as {equation}", file=sys.stderr)

equations -= remove

And finally dump out our reduced set:

if DEBUG:
    print("Final equations:", file=sys.stderr)

print(equations)

OUTPUT

> python3 test.py
Reduce commutative equivalents:
Removed ac+a-b/ same as ca+a-b/
Removed ab*a/c- same as ba*a/c-
Removed cb*a/a- same as bc*a/a-
Removed ac+b-a/ same as ca+b-a/
Removed ba+c/a- same as ab+c/a-
Removed ba+a-c/ same as ab+a-c/
Removed ac+a/b- same as ca+a/b-
Removed ac+b/a- same as ca+b/a-
Removed ac*b-a/ same as ca*b-a/
Removed bc*a-a/ same as cb*a-a/
Removed ca*a-b/ same as ac*a-b/
Removed ba*a-c/ same as ab*a-c/
Removed cb+a/a- same as bc+a/a-
Removed ba+c-a/ same as ab+c-a/
Removed ca*a/b- same as ac*a/b-
Removed ca*b/a- same as ac*b/a-
Removed ba+a/c- same as ab+a/c-
Removed ab*c-a/ same as ba*c-a/
Removed ab*c/a- same as ba*c/a-
Removed cb+a-a/ same as bc+a-a/
Reduce associative equivalents:
Final equations:
{'ca+a-b/', 'cb*a+a-', 'aa/b-c*', 'ba/c-a*', 'cb/a-a*', 'ab+a*c/', 'aa/c+b-',
'bc/a-a+', 'aa*b+c-', 'ba*a/c-', 'ab+c/a*', 'ca-a/b+', 'ca-b+a*', 'bc*a/a-',
'bc/a+a*', 'ac+a/b*', 'bc+a*a-', 'ca/a-b+', 'ac-a*b+', 'ba-a*c/', 'ac/b-a*',
'ba-c+a*', 'ba+a-c*', 'aa+b/c-', 'ca-b*a/', 'ca+b-a/', 'ab+c/a-', 'ac*b+a-',
'aa+c-b/', 'aa*c/b-', 'ab/c*a+', 'ac+b/a*', 'aa+b*c/', 'ab-a*c+', 'ac+a-b*',
'cb-a+a*', 'cb*a/a+', 'ab-c/a+', 'ac*b+a/', 'ba*c/a+', 'ba/c+a*', 'aa-b*c+',
'aa/b+c*', 'ab-c*a+', 'ac+a*b/', 'ac/b+a-', 'aa*b-c+', 'ac-a+b/', 'aa-c*b+',
'ab+a-c/', 'aa-c+b/', 'ba+c*a/', 'ca-b*a+', 'ab-a/c*', 'aa-b/c+', 'ac*a+b/',
'ba/a+c-', 'ba-c/a+', 'cb/a+a*', 'ca+b/a*', 'aa/c*b+', 'ac-a+b*', 'ba-a+c*',
'ca+a*b/', 'aa+b/c*', 'aa/c-b+', 'bc*a/a+', 'ca+a/b-', 'ca+b/a-', 'ca*b-a/',
'ac/b*a-', 'aa*b/c+', 'ba/a*c+', 'bc/a*a+', 'ca-b+a/', 'ac/b+a*', 'aa*b/c-',
'bc-a+a/', 'ca/b-a*', 'ba-c*a/', 'cb*a-a/', 'ba-c/a*', 'aa*b+c/', 'ac*a-b/',
'ca*b/a+', 'aa+b-c*', 'ba/a-c*', 'ca-b/a+', 'ab/c-a+', 'cb+a/a*', 'aa-c/b*',
'ba+c*a-', 'cb*a+a/', 'aa*c/b+', 'ab/c+a*', 'ca+b-a*', 'aa+b-c/', 'ac-b*a/',
'ab*a-c/', 'ba-a*c+', 'ba*c+a-', 'bc/a*a-', 'ba*c-a+', 'ba/c*a+', 'ab-c+a/',
'ba*c+a/', 'ca*a-b+', 'bc+a/a-', 'aa+c*b-', 'ab+c*a-', 'ac-a/b+', 'ca+a-b*',
'aa+c-b*', 'ab/c*a-', 'ab+c-a/', 'bc+a/a*', 'ac-a/b*', 'ab/a-c*', 'ac/a-b+',
'bc-a/a+', 'ab+a*c-', 'ac/a-b*', 'ca*a+b-', 'ab/a-c+', 'ab-a*c/', 'cb/a*a-',
'ac/a+b*', 'bc-a/a*', 'ac-b+a*', 'ac*a/b-', 'ba*a+c-', 'ba/a-c+', 'bc/a+a-',
'aa/b-c+', 'cb+a-a*', 'ca-b/a*', 'ca+b*a-', 'ac*b/a-', 'ca-a+b/', 'ca/b*a-',
'ba+a/c*', 'cb-a*a+', 'ac+a*b-', 'aa*b-c/', 'aa*c-b/', 'ac/a*b+', 'aa-c+b*',
'ca*a+b/', 'ca/b+a-', 'ac*a/b+', 'aa+c/b-', 'ab/c+a-', 'ab+a/c-', 'cb-a+a/',
'ab*a-c+', 'ab-a+c*', 'ab+a/c*', 'ac/b-a+', 'ab*c+a/', 'ba/c+a-', 'ba/c*a-',
'cb-a*a/', 'ac+b*a-', 'ba+c-a*', 'ac/b*a+', 'cb/a*a+', 'cb-a/a+', 'bc*a+a/',
'ac*b/a+', 'cb+a*a-', 'ba*c-a/', 'ca-a*b/', 'ca-a*b+', 'ab/a*c-', 'ba-a+c/',
'ba*a/c+', 'bc-a+a*', 'ca+a/b*', 'ca*a/b+', 'aa*c+b-', 'ba*c/a-', 'bc/a-a*',
'ca/a+b*', 'ab-a+c/', 'ca/b*a+', 'ab-a/c+', 'cb*a-a+', 'aa-b/c*', 'ac-b/a+',
'aa*c-b+', 'ab*c+a-', 'cb/a-a+', 'ab/a+c*', 'ba+a*c-', 'ba*a+c/', 'ba-a/c*',
'aa/b+c-', 'ba/c-a+', 'ca/b-a+', 'ab*a/c+', 'bc+a-a*', 'bc*a-a+', 'ab+c*a/',
'ab-c*a/', 'ac*a+b-', 'ca/a+b-', 'ac/a*b-', 'ac+b-a*', 'ba/a+c*', 'ba-a/c+',
'ab*c/a+', 'cb/a+a-', 'ca/a-b*', 'ac-b/a*', 'ab/a*c+', 'ca*b+a/', 'ac-a*b/',
'aa/b*c+', 'aa/c-b*', 'ca/a*b+', 'bc-a*a/', 'ca+b*a/', 'aa*c+b/', 'ab*a+c/',
'bc+a*a/', 'ab-c/a*', 'ca-a+b*', 'aa-c*b/', 'cb-a/a*', 'aa+b*c-', 'ca+a*b-',
'aa-b+c*', 'ac/a+b-', 'ba-c+a/', 'ba-c*a+', 'ca*b-a+', 'ac-b+a/', 'aa-b*c/',
'aa-b+c/', 'ac*a-b+', 'ac+b*a/', 'ca/a*b-', 'bc+a-a/', 'bc-a*a+', 'ba+a*c/',
'ac*b-a+', 'aa/c+b*', 'ab/a+c-', 'ab/c-a*', 'ab-c+a*', 'ba+c/a*', 'ab*c-a+',
'ab+a-c*', 'cb+a*a/', 'ac-b*a+', 'ba/a*c-', 'ab*a+c-', 'ab+c-a*', 'bc*a+a-',
'aa/b*c-', 'ca*b+a-', 'ba*a-c+', 'ca/b+a*', 'aa-c/b+', 'aa+c/b*', 'ca-a/b*',
'aa/c*b-', 'aa+c*b/'}
>

I'm not claiming a perfect solution, just illustrating some of the tools available to you to solve your problem.

answered Sep 16 '22 18:09

cdlane

To create all the possible expressions we can consider every expression a binary expression tree and then the notation will be just a matter of traversing the tree differently. For example:

tree:                          *
                              / \
             +               -   c
            / \             / \
           a   b           a   b

infix:     a + b          (a - b) * c
postfix    a b +           a b - c *

Since all the required operators are binary, the resulting expression trees are full binary trees, meaning all non-leaf nodes have exactly two children. Another property of binary expression trees is that all operands are the leaves of the tree and all internal nodes are operators, and the number of internal nodes (operators) is one less than the number of the leaves (operands).

Now to create all the possible expressions, first we need all the structurally distinct full binary trees with len(operands) leaves or len(operands)-1 internal nodes.

I use a generator written by the answerer of this question: generate all structurally distinct full binary trees with n leaves.

The code below generates all structurally distinct full binary trees with n leaves. It outputs the tree structure with some notation that you can set in the function. This one is set to show subtrees wrapped in parentheses, operands as x and operators as o. For example for 2 operators and 3 operands:

(xo(xox))       ((xox)ox)
    o               o
   / \             / \
  x   o           o   x
     / \         / \
    x   x       x   x

from itertools import product

def expr_trees(n):
    if n == 1:
        yield 'x'

    for i in range(1, n):
        left = expr_trees(i)
        right = expr_trees(n-i)

        for l, r in product(left, right):
            yield '('+l+'o'+r+')'

for t in expr_trees(3):
    print(t)

Now to generate all the possible expressions we need to place all the permutations without repetition of operands on the leaves and all the permutations of length len(operands)-1 of operators with repetition, at the internal nodes of every tree structure. Here we alter the generator function to use the list of operators and operands and output postfix expressions:

from itertools import permutations, product

def expressions(opds, oprs, idx):
    if len(opds) == 1:
        yield opds[0]

    for i in range(1, len(opds)):
        left = expressions(opds[0:i], oprs, idx+1)

        right = expressions(opds[i:], oprs, idx+1)

        for l, r in product(left, right):
            yield l+r+oprs[idx]

operands = ['a', 'b', 'c']
operators = ['+', '-', '*', '/']

operatorProducts = product(operators, repeat=len(operands)-1)
operandPermutations = permutations(operands)

for opds, oprs in product(operandPermutations, operatorProducts):
    for t in expressions(opds, oprs, 0):
        print(t)

Now about the time complexity. As an example let's calculate the number of all structurally distinct expressions for ['a', 'b', 'c'].

As we saw earlier there are two full binary trees for three operands. The number of permutations of the operands is 3! = 6 and the number of permutations of operators is 4^2 because we choose 2 out of 4 with repetition allowed. Therefore we have:

number of expressions
    = number of trees * number of operand permutations * number of operator permutations
    = 2 * 6 * 16
    = 192

For the general formula the interesting part is the number of structurally distinct binary trees which is the nth Catalan number with n being the number of internal nodes of the tree. You can read more about it at the answer to Counting binary trees.

number of trees with n internal nodes = (1 / n+1) x (2n)! / (n! x n!)

Therefore the number of structurally distinct expressions with n operators or n+1 operands:

(n+1)! x 4^n x (1/n+1) x (2n)! / (n! x n!) = 4^n x (2n)! / n!

(excuse ugly math formulas because of lack of support here. x is multiplication. You can find nicer formatting at the links above.)

Note that n is the number operators or the number of operands - 1.

As you can see the number of possible expressions grows extremely fast with n.

1, 8, 192, 7680, 430080, 30965760, ...

Although there are many equivalent expressions, still they are a small portion of all the expressions and you should think of a practical limit for the number of operands.

That brings us to the next problem which is finding equivalent expressions. It might seem simple at first as one might think it's only about commutative property of + and * but there are cases of - and / changing the rest of the expression in complicated ways which is difficult to catch by just a simple RegExp, IMO. For example abc-- is equivalent of ab-c+ because of unary effect of minus on the parenthesized elements and a more complicated version with the inversion effect of division, abcde+-*/ which is equivalent of abcd-e-//. Adding repeated elements to the list of operands creates more equivalent expressions and makes it even more difficult to catch them all.

I find it much complicated to find all the equivalent expressions and in my opinion your best bet is to implement a function that expands, simplifies and sorts all the terms so that you have a simplified version of each group of equivalent expressions for comparison.

answered Sep 19 '22 18:09

jal

Related questions
                            
                                How to pip install *.whl on Windows (using a wildcard)
                            
                                Mask out sensitive information in python log
                            
                                Print all columns and rows of a numpy array [duplicate]
                            
                                Django save previous object from models
                            
                                Randomly sample from multiple tf.data.Datasets in Tensorflow
                            
                                What's the right way to insert a CalibratedClassifierCV in a scikit-learn pipeline?
                            
                                Drop duplicates keeping the row with the highest value in another column
                            
                                Python memory not being released on linux?
                            
                                How are Counter / defaultdict ordered in Python 3.7?
                            
                                Pandas dataframe drop columns with no header
                            
                                Engines in Python Pandas read_csv
                            
                                Why do dict_items objects not support indexing?
                            
                                pytorch freeze weights and update param_groups
                            
                                Is it bad practice to have arguments called in main() in Python
                            
                                How can I go to a function's definition in Jupyter notebook?
                            
                                Plot data on satellite maps
                            
                                PRAW 6: Get all submission of a subreddit
                            
                                How to convert pendulum to datetime.datetime type?
                            
                                Jupyterlab and Plotly offline: requirejs is not defined
                            
                                Dask: delayed vs futures and task graph generation [closed]

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With