Question

I need to find an efficient algorithm that finds an optimal set of binary vectors in such a way that every index has a bit that is set in at least one vector in the set.

Amusing motivation: I want to break into a castle and steal its treasure. In order to do it, i have to unlock 4 doors: the blue door, the red one, the yellow and the green. There are 3 dwarfs and each one of them have a different set of keys. dwarf 1 has: blue key & red key. dwarf 2 has: red key & yellow key. dwarf 3 has: blue key & green key. I want to kill minimum amount of dwarfs in order to get into the castle. so in this case, it's obvious that we should kill only dwarf 2 and dwarf 3 to get all the keys and that there is no need to kill dwarf 1.

In general, we have n binary vectors of size m (n dwarfs and m doors): a0,a1....a(n-1). We need a set of vectors such that we have at least one key for every index in the vector. If a1[5]=1, then we know that the 6th bit in the 2nd vector is set. meaning dwarf #2 has key #6. The example in the begining is actually like that: a0 (1,1,0,0) a1 (0,1,1,0) a2 (1,0,0,1) so we choose vectors: a1,a2 to get full coverage with minimum vectors.

The brute force naive algorithm is to try every option and then return the solution with minimum amount of vectors, but since there are n vectors, there are 2^n options.

Next i thought of a greedy algorithm that takes each time the vector with most keys. But here is a counter example that proves this method wrong:

  • a0 (1,1,1,0,0,0) --greedy and wrong choise--
  • a1 (1,0,0,1,0,0)
  • a2 (0,1,0,0,1,0)
  • a3 (0,0,1,0,0,1)

with this algorithm, we will choose all 4 vectors, but the optimal solution has only 3 vectors {a1,a2,a3}.

Now, i thought of another greedy algorithm, we choose the vector with the most rare key (and in case of a tie, we search the next rare key and so on), but then again i found a counter example:

  • a0 (1,1,1,0,0,0) --greedy and wrong choise--
  • a1 (1,0,0,1,0,0)
  • a2 (0,1,0,0,1,0)
  • a3 (0,0,1,0,0,1)
  • a4 (0,0,0,1,0,0)
  • a5 (0,0,0,0,1,0)
  • a6 (0,0,0,0,0,1)

in this example all the indexes have the same "rarity" (2), so we end up choosing a0 although a sulotion with a0 (such as {a0,a1,a2,a3}) has at least 4 vectors and the optimal solution has only 3 {a1,a2,a3}.

I think that maybe if i would eliminate vectors that are subsets of another vector (for instance a6 is a subset of a3) then this algorithm may work, but even so, checking this would cost n!.

I hope you could help me find an efficient algorithm or help me prove that such algorithm is not existed.

Était-ce utile?

La solution

This problem is called Set Cover Problem and it is NP-Hard.

You may find this short video helpful: discrete optimization (you have to be logged into your coursera account). It comes from great course on Discrete Optimization -- the archive of the class is still opened.

(Your reasoning about pruning space-search is quite sound: you can just immediately remove all vectors that are dominated by another vector and you have to take a vector if no other vector has the same key.)

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top