I have 2 sets, set A contains set of random numbers and set B's elements are sum of set A's subsets.
For example,
A = [8, 9, 15, 15, 33, 36, 39, 45, 46, 60, 68, 73, 80, 92, 96]
B = [183, 36, 231, 128, 137]
I want to find which number is sum of which subset with data like this.
S = [[45, 46, 92], [36], [8, 15, 39, 73, 96], [60, 68], [9, 15, 33, 80]]
I was able to write really dumb brute force code with python.
class SolvedException(BaseException):
pass
def solve(sums, nums, answer):
num = nums[-1]
for i in range(0, len(sums)):
sumi = sums[i]
if sumi == 0:
continue
elif sumi - num < 0:
continue
answer[i].append(num)
sums[i] = sumi - num
if len(nums) != 1:
solve(sums, nums[:-1], answer)
elif sumi - num == 0:
raise SolvedException(answer)
sums[i] = sumi
answer[i].pop()
try:
solve(B, A, [list() for i in range(0, len(B))])
except SolvedException as e:
print e.args[0]
This code works pretty well for small datas, but it will take billion years to calculate my data(which has 71 numbers and 10 sums).
I could use some better algoritms or optimization.
Sorry for my bad English and terrible unefficient code.
Edit : Sorry, I realized that I didn't described the problem accurately.
As every single element in A
are used to make elemens of B, sum(A) == sum(B)
Also, set S
must be partition of set A
.
This is known as the subset-sum problem and it is a well known NP-complete problem. So basically there is no efficient solution. See for example https://en.wikipedia.org/wiki/Subset_sum_problem
However If your number N is not too large, there is a pseudo polynomial algorithms, using dynamic programming: You read the list A from left to right and keep the list of the sum which are doable and smaller than N. If you know the number which are doable for a given A, you can easily get those which are doable for A + [a]. Hence the dynamic programming. It will typically be fast enough for a problem of the size you gave there.
Here is a Python quick solution:
def subsetsum(A, N):
res = {0 : []}
for i in A:
newres = dict(res)
for v, l in res.items():
if v+i < N:
newres[v+i] = l+[i]
elif v+i == N:
return l+[i]
res = newres
return None
Then
>>> A = [8, 9, 15, 15, 33, 36, 39, 45, 46, 60, 68, 73, 80, 92, 96]
>>> subsetsum(A, 183)
[15, 15, 33, 36, 39, 45]
After OP edit:
Now I correctly understand you problem, I'll still think that your problem can be solved efficiently, provided you have an efficient subset-sum solver: I'd use divide and conquer solution on B:
However, your (71, 10) problem below is out of reach for the dynamic programming solution I suggested.
By the way, here is a quick solution of your problem not using divide and conquer, but which contains the correct adaptation of my dynamic solver to get all solutions:
class NotFound(BaseException):
pass
from collections import defaultdict
def subset_all_sums(A, N):
res = defaultdict(set, {0 : {()}})
for nn, i in enumerate(A):
# perform a deep copy of res
newres = defaultdict(set)
for v, l in res.items():
newres[v] |= set(l)
for v, l in res.items():
if v+i <= N:
for s in l:
newres[v+i].add(s+(i,))
res = newres
return res[N]
def list_difference(l1, l2):
## Similar to merge.
res = []
i1 = 0; i2 = 0
while i1 < len(l1) and i2 < len(l2):
if l1[i1] == l2[i2]:
i1 += 1
i2 += 1
elif l1[i1] < l2[i2]:
res.append(l1[i1])
i1 += 1
else:
raise NotFound
while i1 < len(l1):
res.append(l1[i1])
i1 += 1
return res
def solve(A, B):
assert sum(A) == sum(B)
if not B:
return [[]]
res = []
ss = subset_all_sums(A, B[0])
for s in ss:
rem = list_difference(A, s)
for sol in solve(rem, B[1:]):
res.append([s]+sol)
return res
Then:
>>> solve(A, B)
[[(15, 33, 39, 96), (36,), (8, 15, 60, 68, 80), (9, 46, 73), (45, 92)],
[(15, 33, 39, 96), (36,), (8, 9, 15, 46, 73, 80), (60, 68), (45, 92)],
[(8, 15, 15, 33, 39, 73), (36,), (9, 46, 80, 96), (60, 68), (45, 92)],
[(15, 15, 73, 80), (36,), (8, 9, 33, 39, 46, 96), (60, 68), (45, 92)],
[(15, 15, 73, 80), (36,), (9, 39, 45, 46, 92), (60, 68), (8, 33, 96)],
[(8, 33, 46, 96), (36,), (9, 15, 15, 39, 73, 80), (60, 68), (45, 92)],
[(8, 33, 46, 96), (36,), (15, 15, 60, 68, 73), (9, 39, 80), (45, 92)],
[(9, 15, 33, 46, 80), (36,), (8, 15, 39, 73, 96), (60, 68), (45, 92)],
[(45, 46, 92), (36,), (8, 15, 39, 73, 96), (60, 68), (9, 15, 33, 80)],
[(45, 46, 92), (36,), (8, 15, 39, 73, 96), (15, 33, 80), (9, 60, 68)],
[(45, 46, 92), (36,), (15, 15, 60, 68, 73), (9, 39, 80), (8, 33, 96)],
[(45, 46, 92), (36,), (9, 15, 15, 39, 73, 80), (60, 68), (8, 33, 96)],
[(9, 46, 60, 68), (36,), (8, 15, 39, 73, 96), (15, 33, 80), (45, 92)]]
>>> %timeit solve(A, B)
100 loops, best of 3: 10.5 ms per loop
So it is quite fast for this size of problem, though nothing is optimized here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With