Fast solution to Subset sum

https://stackoverflow.com/questions/9809436

25-05-2021
|

質問

Consider this way of solving the Subset sum problem:

def subset_summing_to_zero (activities):
  subsets = {0: []}
  for (activity, cost) in activities.iteritems():
      old_subsets = subsets
      subsets = {}
      for (prev_sum, subset) in old_subsets.iteritems():
          subsets[prev_sum] = subset
          new_sum = prev_sum + cost
          new_subset = subset + [activity]
          if 0 == new_sum:
              new_subset.sort()
              return new_subset
          else:
              subsets[new_sum] = new_subset
  return []

I have it from here:

http://news.ycombinator.com/item?id=2267392

There is also a comment which says that it is possible to make it "more efficient".

How?

Also, are there any other ways to solve the problem which are at least as fast as the one above?

Edit

I'm interested in any kind of idea which would lead to speed-up. I found:

https://en.wikipedia.org/wiki/Subset_sum_problem#cite_note-Pisinger09-2

which mentions a linear time algorithm. But I don't have the paper, perhaps you, dear people, know how it works? An implementation perhaps? Completely different approach perhaps?

Edit 2

There is now a follow-up:
Fast solution to Subset sum algorithm by Pisinger

解決

While my previous answer describes the polytime approximate algorithm to this problem, a request was specifically made for an implementation of Pisinger's polytime dynamic programming solution when all x_i in x are positive:

from bisect import bisect

def balsub(X,c):
    """ Simple impl. of Pisinger's generalization of KP for subset sum problems
    satisfying xi >= 0, for all xi in X. Returns the state array "st", which may
    be used to determine if an optimal solution exists to this subproblem of SSP.
    """
    if not X:
        return False
    X = sorted(X)
    n = len(X)
    b = bisect(X,c)
    r = X[-1]
    w_sum = sum(X[:b])
    stm1 = {}
    st = {}
    for u in range(c-r+1,c+1):
        stm1[u] = 0
    for u in range(c+1,c+r+1):
        stm1[u] = 1
    stm1[w_sum] = b
    for t in range(b,n+1):
        for u in range(c-r+1,c+r+1):
            st[u] = stm1[u]
        for u in range(c-r+1,c+1):
            u_tick = u + X[t-1]
            st[u_tick] = max(st[u_tick],stm1[u])
        for u in reversed(range(c+1,c+X[t-1]+1)):
            for j in reversed(range(stm1[u],st[u])):
                u_tick = u - X[j-1]
                st[u_tick] = max(st[u_tick],j)
    return st

Wow, that was headache-inducing. This needs proofreading, because, while it implements balsub, I can't define the right comparator to determine if the optimal solution to this subproblem of SSP exists.

他のヒント

I respect the alacrity with which you're trying to solve this problem! Unfortunately, you're trying to solve a problem that's NP-complete, meaning that any further improvement that breaks the polynomial time barrier will prove that P = NP.

The implementation you pulled from Hacker News appears to be consistent with the pseudo-polytime dynamic programming solution, where any additional improvements must, by definition, progress the state of current research into this problem and all of its algorithmic isoforms. In other words: while a constant speedup is possible, you're very unlikely to see an algorithmic improvement to this solution to the problem in the context of this thread.

However, you can use an approximate algorithm if you require a polytime solution with a tolerable degree of error. In pseudocode blatantly stolen from Wikipedia, this would be:

initialize a list S to contain one element 0.
 for each i from 1 to N do
   let T be a list consisting of xi + y, for all y in S
   let U be the union of T and S
   sort U
   make S empty 
   let y be the smallest element of U 
   add y to S 
   for each element z of U in increasing order do
      //trim the list by eliminating numbers close to one another
      //and throw out elements greater than s
     if y + cs/N < z ≤ s, set y = z and add z to S 
 if S contains a number between (1 − c)s and s, output yes, otherwise no

Python implementation, preserving the original terms as closely as possible:

from bisect import bisect

def ssum(X,c,s):
    """ Simple impl. of the polytime approximate subset sum algorithm 
    Returns True if the subset exists within our given error; False otherwise 
    """
    S = [0]
    N = len(X)
    for xi in X:
        T = [xi + y for y in S]
        U = set().union(T,S)
        U = sorted(U) # Coercion to list
        S = []
        y = U[0]
        S.append(y)
        for z in U: 
            if y + (c*s)/N < z and z <= s:
                y = z
                S.append(z)
    if not c: # For zero error, check equivalence
        return S[bisect(S,s)-1] == s
    return bisect(S,(1-c)*s) != bisect(S,s)

... where X is your bag of terms, c is your precision (between 0 and 1), and s is the target sum.

For more details, see the Wikipedia article.

(Additional reference, further reading on CSTheory.SE)

I don't know much python, but there is an approach called meet in the middle. Pseudocode:

Divide activities into two subarrays, A1 and A2
for both A1 and A2, calculate subsets hashes, H1 and H2, the way You do it in Your question.
for each (cost, a1) in H1
     if(H2.contains(-cost))
         return a1 + H2[-cost];

This will allow You to double the number of elements of activities You can handle in reasonable time.

I apologize for "discussing" the problem, but a "Subset Sum" problem where the x values are bounded is not the NP version of the problem. Dynamic programing solutions are known for bounded x value problems. That is done by representing the x values as the sum of unit lengths. The Dynamic programming solutions have a number of fundamental iterations that is linear with that total length of the x's. However, the Subset Sum is in NP when the precision of the numbers equals N. That is, the number or base 2 place values needed to state the x's is = N. For N = 40, the x's have to be in the billions. In the NP problem the unit length of the x's increases exponentially with N.That is why the dynamic programming solutions are not a polynomial time solution to the NP Subset Sum problem. That being the case, there are still practical instances of the Subset Sum problem where the x's are bounded and the dynamic programming solution is valid.

Here are three ways to make the code more efficient:

The code stores a list of activities for each partial sum. It is more efficient in terms of both memory and time to just store the most recent activity needed to make the sum, and work out the rest by backtracking once a solution is found.
For each activity the dictionary is repopulated with the old contents (subsets[prev_sum] = subset). It is faster to simply grow a single dictionary
Splitting the values in two and applying a meet in the middle approach.

Applying the first two optimisations results in the following code which is more than 5 times faster:

def subset_summing_to_zero2 (activities):
  subsets = {0:-1}
  for (activity, cost) in activities.iteritems():
      for prev_sum in subsets.keys():
          new_sum = prev_sum + cost
          if 0 == new_sum:
              new_subset = [activity]
              while prev_sum:
                  activity = subsets[prev_sum]
                  new_subset.append(activity)
                  prev_sum -= activities[activity]
              return sorted(new_subset)
          if new_sum in subsets: continue
          subsets[new_sum] = activity
  return []

Also applying the third optimisation results in something like:

def subset_summing_to_zero3 (activities):
  A=activities.items()
  mid=len(A)//2
  def make_subsets(A):
      subsets = {0:-1}
      for (activity, cost) in A:
          for prev_sum in subsets.keys():
              new_sum = prev_sum + cost
              if new_sum and new_sum in subsets: continue
              subsets[new_sum] = activity
      return subsets
  subsets = make_subsets(A[:mid])
  subsets2 = make_subsets(A[mid:])

  def follow_trail(new_subset,subsets,s):
      while s:
         activity = subsets[s]
         new_subset.append(activity)
         s -= activities[activity]

  new_subset=[]
  for s in subsets:
      if -s in subsets2:
          follow_trail(new_subset,subsets,s)
          follow_trail(new_subset,subsets2,-s)
          if len(new_subset):
              break
  return sorted(new_subset)

Define bound to be the largest absolute value of the elements. The algorithmic benefit of the meet in the middle approach depends a lot on bound.

For a low bound (e.g. bound=1000 and n=300) the meet in the middle only gets a factor of about 2 improvement other the first improved method. This is because the dictionary called subsets is densely populated.

However, for a high bound (e.g. bound=100,000 and n=30) the meet in the middle takes 0.03 seconds compared to 2.5 seconds for the first improved method (and 18 seconds for the original code)

For high bounds, the meet in the middle will take about the square root of the number of operations of the normal method.

It may seem surprising that meet in the middle is only twice as fast for low bounds. The reason is that the number of operations in each iteration depends on the number of keys in the dictionary. After adding k activities we might expect there to be 2**k keys, but if bound is small then many of these keys will collide so we will only have O(bound.k) keys instead.

Thought I'd share my Scala solution for the discussed pseudo-polytime algorithm described in wikipedia. It's a slightly modified version: it figures out how many unique subsets there are. This is very much related to a HackerRank problem described at https://www.hackerrank.com/challenges/functional-programming-the-sums-of-powers. Coding style might not be excellent, I'm still learning Scala :) Maybe this is still helpful for someone.

object Solution extends App {
    var input = "1000\n2"

    System.setIn(new ByteArrayInputStream(input.getBytes()))        

    println(calculateNumberOfWays(readInt, readInt))

    def calculateNumberOfWays(X: Int, N: Int) = {
            val maxValue = Math.pow(X, 1.0/N).toInt

            val listOfValues = (1 until maxValue + 1).toList

            val listOfPowers = listOfValues.map(value => Math.pow(value, N).toInt)

            val lists = (0 until maxValue).toList.foldLeft(List(List(0)): List[List[Int]]) ((newList, i) => 
                    newList :+ (newList.last union (newList.last.map(y => y + listOfPowers.apply(i)).filter(z => z <= X)))
            )

            lists.last.count(_ == X)        

    }
}

ライセンス： CC-BY-SA と帰属

所属していません StackOverflow