質問

I've been working through some algorithm programming problems and have a question about one. The problem is the same one as the one referenced in this question: USACO: Subsets (Inefficient)

I was able to code some (non-dynamic) solutions that were too slow for high N. Had to cheat a bit and look up some solutions online. It turns out the fast algorithm is trivially simple, but even knowing the answer I still can't see how to get from the problem to the answer. I can see patterns in the way subsets of equal sums form, but I don't see the link between those patterns and the algorithmic solution.

The problem (link above) is:

Given a set of consecutive integers from 1 through N (1 <= N <= 39), how many ways can the set be partitioned into two subsets whose sums are identical? E.g., {1,2,3} can be partitioned one way: {1,2} {3}.

For larger sets the answer is either 0 (when N*(N+1)/2 is odd) or is given by this simple algorithm:

  arr = array of int with (N*(N+1)/4)+1 elements
  arr[0]=1  // all other elements initialized to 0
  for i = 1 to N
    for j = N*(N+1) / 4 downto i
      add arr[j-i] to arr[j]
  subsetpaircount = arr[N*(N+1)/4] / 2

Again, I can see how the algorithm works, I've even inserted print statements so I can "watch" how it works. I just can't see how the operation of the algorithm links up to the pattern of different ways the two-set partitions are generated.

A response in the linked question may be relevant, but I also can't connect up how it works: "This is the same thing as finding the coefficient x^0 term in the polynomial (x^1+1/x)(x^2+1/x^2)...(x^n+1/x^n). . . ."

Can anyone clarify the connection for me, or point me to some reference materials that explain this specific problem? Thanks.

役に立ちましたか?

解決

If the set S = {1,...,N} is partitioned into two subsets with equal sum, that sum must be half of the sum of S; the sum of S is N(N+1)/2 so the sum of each subset in the partition must be N(N+1)/4. It must also be an integer, hence N(N+1)/2 must be even.

So the question reduces to finding the number of subsets of S whose sum is N(N+1)/4. The total number of partitions will be exactly half this number, since each partition contains two such subsets, and no two partitions share a subset.

That much should be obvious.

Now, let's list the subsets of S which sum to k, for any k and any set S. Any such subset must have a greatest value, which must be an element of S. The greatest value must either be the greatest element of S, or it must be less than the greatest element of S. These two groups of subsets are distinct, so we can enumerate them separately. Let's call the greatest element of S Smax.

The second group is simple: it's simply the subsets of S - { Smax } which sum to k. We can find those by recursively calling the subset lister. But the first group is almost as simple: each set in the group includes Smax and the rest of its elements are some set in S - { Smax } which adds up to k - Smax, which again we can list recursively. To complete the recursion, we note that if S is the empty set, then if k = 0, there is precisely one qualify subset (the empty set itself), and if k is not 0, then there are no qualifying subsets. Each recursion removes one element from S, so the termination condition must eventually be reached.

It should be clear that the subsets of S which will be used by the above recursive function are just the numbers from 1 to Smax, so we can get rid of S altogether, and write the recursion as follows:

Subsets(min, max, k) =
  Subsets(min, max - 1, k)
  ⋃ { {max, P} | P ∈ Subsets(min, max - 1, k - max) }

But we only want the count of the number of partitions, so we can simplify that a bit:

Count_Subsets(min, max, k) =
  Count_Subsets(min, max - 1, k)
  + Count_Subsets(min, max - 1, k - max)

We need to complete the recursion by adding the end condition:

If min > max, Count_Subsets(min, max, k) = 1 if k = 0; otherwise 0

(In fact, it's simple to show that the recursion implies that the value will be 1 when k decrements to 0, and 0 if k is less than 0, so we can make the termination condition happen a lot earlier.)

That gives us the simple recursion for the count. But since it calls itself twice, it would be better to work backwards ("dynamic programming"). We need to compute Count_Subsets(1, N, N*(N+1)/4), which will require us to compute values of Count_Subsets(1, max, k) for all values of max from 1 to N, and from all values of k from 0 to N*(N+1)/4. We do that by starting with max = 0, and working up until we reach min = N. And that's precisely what your algorithm does; i is max, and the array is the set of values for k from 0 to N(N+1)/4.

By the way, as should be apparent from the above description, a[j] should be incremented by a[j - i], not by a[j - 1]

他のヒント

I think there might be a mistake in your pseudocode which is causing the confusion. I would expect the line

add arr[j-1] to arr[j]

to be

add arr[j-i] to arr[j]

Assuming this is the case, then the way to think about this problem is that at the start of each iteration of the loop over i, the array arr[j] contains the number of ways of choosing a subset of the integers 1,2,...,i-1 such that the sum of the chosen integers is exactly j.

When you start, i=1 so the only choice of subset is the empty subset with sum equal to 1.

This is why arr[0]=1 (representing using an empty subset to get a total of 0) while all other entries are 0 (as there is no way of getting a non-zero sum).

From then on, each pass of the iteration considers adding the number i to the subset. The number of ways of getting a sum j will depend on whether i is included or not.

If i is not included, then we have the same number of ways as before (i.e. arr[j]).

If i is included, then in order to have a sum of j including i, we must add i to all the subsets of 1,..,i-1 which had a total of j-i. By design, our array contains exactly this number if we look at index j-i.

So the total number of ways of getting a sum of j becomes arr[j]+arr[j-i].

When the i loop completes, arr gives you the number of ways of choosing a subset and getting any sum you wish. We know that the sum of 1,2,..,n is n*(n-1)/2, so if we count how many subsets reach half this value then we are counting the partitions into equal sums.

Actually, this overcounts by a factor of 2 because it counts {1,2}/{3] and {3}/{1,2} as separate solutions, so the final answer is divided by 2.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top