I've been working through some algorithm programming problems and have a question about one. The problem is the same one as the one referenced in this question: USACO: Subsets (Inefficient)
I was able to code some (non-dynamic) solutions that were too slow for high N. Had to cheat a bit and look up some solutions online. It turns out the fast algorithm is trivially simple, but even knowing the answer I still can't see how to get from the problem to the answer. I can see patterns in the way subsets of equal sums form, but I don't see the link between those patterns and the algorithmic solution.
The problem (link above) is:
Given a set of consecutive integers from 1 through N (1 <= N <= 39), how many ways can the set be partitioned into two subsets whose sums are identical? E.g., {1,2,3} can be partitioned one way: {1,2} {3}.
For larger sets the answer is either 0 (when N*(N+1)/2 is odd) or is given by this simple algorithm:
arr = array of int with (N*(N+1)/4)+1 elements
arr[0]=1 // all other elements initialized to 0
for i = 1 to N
for j = N*(N+1) / 4 downto i
add arr[j-i] to arr[j]
subsetpaircount = arr[N*(N+1)/4] / 2
Again, I can see how the algorithm works, I've even inserted print statements so I can "watch" how it works. I just can't see how the operation of the algorithm links up to the pattern of different ways the two-set partitions are generated.
A response in the linked question may be relevant, but I also can't connect up how it works: "This is the same thing as finding the coefficient x^0 term in the polynomial (x^1+1/x)(x^2+1/x^2)...(x^n+1/x^n). . . ."
Can anyone clarify the connection for me, or point me to some reference materials that explain this specific problem? Thanks.
If the set S = {1,...,N}
is partitioned into two subsets with equal sum, that sum must be half of the sum of S
; the sum of S
is N(N+1)/2
so the sum of each subset in the partition must be N(N+1)/4
. It must also be an integer, hence N(N+1)/2
must be even.
So the question reduces to finding the number of subsets of S whose sum is N(N+1)/4
. The total number of partitions will be exactly half this number, since each partition contains two such subsets, and no two partitions share a subset.
That much should be obvious.
Now, let's list the subsets of S
which sum to k
, for any k
and any set S
. Any such subset must have a greatest value, which must be an element of S
. The greatest value must either be the greatest element of S
, or it must be less than the greatest element of S
. These two groups of subsets are distinct, so we can enumerate them separately. Let's call the greatest element of S
Smax
.
The second group is simple: it's simply the subsets of S - { Smax }
which sum to k
. We can find those by recursively calling the subset lister. But the first group is almost as simple: each set in the group includes Smax
and the rest of its elements are some set in S - { Smax }
which adds up to k - Smax
, which again we can list recursively. To complete the recursion, we note that if S
is the empty set, then if k = 0
, there is precisely one qualify subset (the empty set itself), and if k is not 0, then there are no qualifying subsets. Each recursion removes one element from S
, so the termination condition must eventually be reached.
It should be clear that the subsets of S
which will be used by the above recursive function are just the numbers from 1
to Smax
, so we can get rid of S
altogether, and write the recursion as follows:
Subsets(min, max, k) =
Subsets(min, max - 1, k)
⋃ { {max, P} | P ∈ Subsets(min, max - 1, k - max) }
But we only want the count of the number of partitions, so we can simplify that a bit:
Count_Subsets(min, max, k) =
Count_Subsets(min, max - 1, k)
+ Count_Subsets(min, max - 1, k - max)
We need to complete the recursion by adding the end condition:
If min > max, Count_Subsets(min, max, k) = 1 if k = 0; otherwise 0
(In fact, it's simple to show that the recursion implies that the value will be 1
when k decrements to 0
, and 0
if k
is less than 0
, so we can make the termination condition happen a lot earlier.)
That gives us the simple recursion for the count. But since it calls itself twice, it would be better to work backwards ("dynamic programming"). We need to compute Count_Subsets(1, N, N*(N+1)/4)
, which will require us to compute values of Count_Subsets(1, max, k)
for all values of max from 1 to N, and from all values of k from 0 to N*(N+1)/4. We do that by starting with max = 0, and working up until we reach min = N. And that's precisely what your algorithm does; i
is max
, and the array is the set of values for k from 0 to N(N+1)/4
.
By the way, as should be apparent from the above description, a[j]
should be incremented by a[j - i]
, not by a[j - 1]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With