Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sum-of-Product of subsets

Is there a name for this operation? And: is there a closed-form expression?

  • For a given set of n elements, and value k between 1 and n,
  • Take all subsets (combinations) of k items
  • Find the product of each subset
  • Find the sum of all those products

I can express this in Python, and do the calculation pretty easily:

from operator import mul
from itertools import combinations
from functools import reduce
def sum_of_product_of_subsets(list1, k):
    val = 0
    for subset in combinations(list1, k):
        val += reduce(mul, subset)
    return val

I'm just looking for the closed form expression, so as to avoid the loop in case the set size gets big.

Note this is NOT the same as this question: Sum of the product over all combinations with one element from each group -- that question is about the sum-of-products of a Cartesian product. I'm looking for the sum-of-products of the set of combinations of size k; I don't think they are the same.

To be clear, for set(a, b, c, d), then:

k = 4 --> a*b*c*d
k = 3 --> b*c*d + a*c*d + a*b*d + a*b*c
k = 2 --> a*b + a*c + a*d + b*c + b*d + c*d
k = 1 --> a + b + c + d

Just looking for the expression; no need to supply the Python code specifically. (Any language would be illustrative, if you'd like to supply an example implementation.)

like image 602
Dan H Avatar asked Apr 11 '12 12:04

Dan H


2 Answers

These are elementary symmetric polynomials. You can write them using summation signs as in Wikipedia. You can also use Vieta's formulas to get all of them at once as coefficients of a polynomial (up to signs)

(x-a_1)(x-a_2)...(x-a_k) =
   x^k -
   (a_1 + ... + a_k) x^(k-1) +
   (a_1 a_2 + a_1 a_3 + ... + a_(k-1) a_k)) x^(k-2) +
   ... +
   (-1)^k a_1 ... a_k

By expanding (x-a_1)(x-a_2)...(x-a_k) you get a polynomial time algorithm to compute all those numbers (your original implementation runs in exponential time).

Edit: Python implementation:

from itertools import izip, chain

l = [2,3,4]

x = [1]    
for i in l:
    x = [a + b*i for a,b in izip(chain([0],x), chain(x,[0]))]
print x

That gives you [24, 26, 9, 1], as 2*3*4=24, 2*3+2*4+3*4=26, 2+3+4=9. That last 1 is the empty product, which corresponds to k=0 in your implementation.

This should be O(N2). Using polynomial FFT you could do O(N log2 N), but I am too lazy to code that.

like image 136
sdcvvc Avatar answered Sep 18 '22 15:09

sdcvvc


I have just run into the same problem elsewhere and I might have an easier solution. Basically the closed form you are looking for is this one:

(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n) - 1

where considering the set S={e_1, e_2, ..., e_n}

Here is why:

Let 'm' be the product of the elements of S (n=e_1*e_2*...*e_n).
If you look at the original products of elements of subsets, you can see, that all of those products are divisors of 'm'.
Now apply the Divisor function to 'm' (from now on called sigma(m) ) with one modification: consider all e_i elements as 'primes' (because we don't want them to be divided), so this gives sigma(e_i)=e_i+1 .

Then if you apply sigma to m:

sigma(m)=sigma(e_1*e_2*...*e_n)=1+[e_1+e_2+...+e_n]+[e_1*e_2+e_1*e_3+...+e_(n-1)*e_n]+[e_1*e_2*e_3+e_1*e_2*e_3+...+e_(n-2)]+...+[e_1*e_2*...*e_n]

This is what the original problem was. (Except for the 1 in the beginning). Our divisor function is multiplicative, so the previous equation can be rewritten as following:

sigma(m)=(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n)
There is one correction you need here. It is because of the empty subset (which is taken into account here, but in the original problem it is not present), which includes '1' in the sum (in the beginning of the firs equation). So the closed form, what you need is:
(1+e_1)*(1+e_2)*(1+e_3)*...*(1+e_n) - 1

Sorry, I can't really code that, but I think the computation shouldn't take more than 2n-1 loops.

(You can read more about the divisor function here: http://en.wikipedia.org/wiki/Divisor_function)

like image 37
DanL Avatar answered Sep 16 '22 15:09

DanL