An Exact Method for Solving Small Instances of XORSAT

https://cs.stackexchange.com/questions/119999

satisfiability

28-09-2020
|

Pergunta

I have a couple of small instances for XORSAT for which I am to design and implement an exact method. However, there are a few catches. It is guaranteed that there always exists and answer but I need to find the one solution with the minimal (non-zero) number of TRUE variables.

As I understand, XORSAT can apparently be solved in polynomial time through the Gaussian Elimination method but I simply cannot apply the special conditions of this problem (the minimal number of TRUE variables) and I'm wondering if I could potentially get some help on that.

The instances of XORSAT that I have are not completely random, they all consist of a number of variables that if XOR'd together will always yield FALSE. I.e., all of the clauses are in the form of $x_1 \oplus x_2 \oplus ... \oplus x_i = 0$ which would translate into the equation $x_1 + x_2 + ... + x_i = 0\space(mod 2)$.

At first, I tried to apply a greedy method, taking the two most repeated variables in all of the clauses and forcing the value TRUE on them, then propagating their values through all of the clauses and repeating the same method on the remaining unsatisfied clauses but that was fundamentally flawed.

Any help is tremendously appreciated.

Solução

If your formula has $n$ variables, for each value of $k$ from 1 to $n$, add an exactly-$k$ cardinality constraint to your SAT instance and then run a SAT solver on each result. The solver will only emit a solution with exactly $k$ variables set TRUE and will return UNSAT if no such solution is possible. As soon as the solver returns something other than UNSAT for an instance containing an exactly-$k$ constraint you have your answer. At worst you will need $n$ runs of the solver.

If you want to get fancy, you can use binary search to probe the $k$ values until you bracket the minimal one and you'll only need $O(\log n)$ solver runs.

Outras dicas

I'll assume all of the equations take the form $x_i + \cdots + x_j = 0$ (i.e., always 0 on the right-hand side). Then, your problem is: given a boolean matrix $M$, find a $x$ such that $Mx=0$ (modulo 2) that minimizes the Hamming weight of $x$.

I think this is equivalent to the problem of finding the minimum-weight codeword of a linear code. That problem is NP-hard, and hard to approximate. The equivalence: we think of $M$ as the parity-check matrix of a linear code; then $x$ is a codeword iff $Mx=0$, so your goal is to find a minimum-weight codeword. I encourage you to check my reasoning here.

There is a brute-force algorithm that runs in $O({n \choose w})$ time, where $n$ is the number of boolean variables and $w$ is the weight of the solution. Assume that $w$ is known (in practice we can search over increasing values for $w$, as there aren't too many possibilities for it). Then, you simply enumerate all ways to choose a subset of $w$ of the variables to set to true and check whether any of them work.

I think there is a meet-in-the-middle algorithm that takes exponential time but is faster than enumerating all possibilities. The algorithm takes $O({n/2 \choose w/2} \sqrt{w})$ time and space, which is significantly better than the brute-force algorithm. We'll write $x=u+v$, where $u$ is zero in its right $n/2$ bits and $v$ is zero in its left $n/2$ bits. Then $Mx=0$ is equivalent to $Mu=Mv$. We'll optimistically hope that $u$ has weight $w/2$ and $v$ has weight $w/2$; as I argue later, this happens with good probability. So, we enumerate all ${n/2 \choose w/2}$ values of $u$ and store $Mu$ in a hashtable. Then, for each of the ${n/2 \choose w/2}$ possible values of $v$, we compute $Mv$ and look it up in the hashtable. If we find a match $Mu=Mv$, we've found solution $x=u+v$. What's the success probability of this algorithm? It is ${n/2 \choose w/2}^2 / {n \choose w} \approx \sqrt{2/(\pi w)}$. Thus, if we repeat the algorithm about $\sqrt{\pi w/2} = O(\sqrt{w})$ times, with a different split of the $n$ coordinates into two equal-size subsets, we expect to find one solution. Check my reasoning here -- this is from memory, so I can't vouch for this.

I believe there are algorithms in the literature that improve upon this scheme. See, e.g., A new algorithm for finding minimum-weight words in a linear code: application to McEliece’s cryptosystem and to narrow-sense BCH codes of length 511 by Canteaut and Chabaud. I think there has been more recent work but I'm not up to speed on it.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange