Question

In Section 5.1 of The Design of Approximation Algorithms by Williamson and Shmoys, they describe a basic randomized algorithm for MAX SAT and how to derandomize it. The algorithm is just to assign each variable 1 (true) with probability 1/2 and 0 (false) with probability 1/2. In other words, sample uniformly at random from the space of all solutions. They show that this is a 1/2-approximation.
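
For concreteness, here is a minimal Python sketch of that randomized algorithm. The clause encoding (a list of clauses, each a list of signed, 1-indexed literals) and the function names are my own, not the book's:

```python
import random

def random_assignment(num_vars):
    """Set each variable True with probability 1/2, independently."""
    return [random.random() < 0.5 for _ in range(num_vars)]

def satisfied_count(clauses, assignment):
    """Count satisfied clauses. A clause is a list of literals:
    literal i > 0 means x_i, literal i < 0 means NOT x_i (1-indexed)."""
    return sum(
        1 for clause in clauses
        if any((lit > 0) == assignment[abs(lit) - 1] for lit in clause)
    )

# Example: (x1 OR NOT x2) AND (x2 OR x3) AND (NOT x1)
clauses = [[1, -2], [2, 3], [-1]]
print(satisfied_count(clauses, random_assignment(3)))
```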

Then in Section 5.2, they describe how to derandomize it using the method of conditional expectations. (I won't describe the process in detail here, since it is not very complex and, I'm assuming, widely known.)
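
Roughly, the idea is to fix the variables one at a time, always choosing the value that keeps the conditional expectation of the number of satisfied clauses as high as possible. Here is a sketch under the same clause encoding as above; this is my own illustration, not the book's presentation:

```python
def conditional_expectation(clauses, partial):
    """Expected number of satisfied clauses when the variables in `partial`
    (a dict mapping variable index -> bool) are fixed and all remaining
    variables are set uniformly at random."""
    total = 0.0
    for clause in clauses:
        satisfied = False
        unfixed = 0
        for lit in clause:
            var = abs(lit)
            if var in partial:
                if (lit > 0) == partial[var]:
                    satisfied = True
                    break
            else:
                unfixed += 1
        # A clause with k unfixed literals and no fixed true literal is
        # satisfied with probability 1 - (1/2)^k (which is 0 when k = 0).
        total += 1.0 if satisfied else 1.0 - 0.5 ** unfixed
    return total

def derandomized_assignment(clauses, num_vars):
    """Method of conditional expectations: greedily fix x_1, ..., x_n."""
    partial = {}
    for var in range(1, num_vars + 1):
        exp_true = conditional_expectation(clauses, {**partial, var: True})
        exp_false = conditional_expectation(clauses, {**partial, var: False})
        partial[var] = exp_true >= exp_false
    return [partial[v] for v in range(1, num_vars + 1)]
```

Since the better of the two choices never decreases the conditional expectation, the final assignment satisfies at least as many clauses as the unconditional expectation, i.e. at least half of them.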

My question is, why bother derandomizing this way? Or even, why bother making the algorithm random in the first place?

It seems to me that an equally good algorithm would be the one-liner that deterministically sets all variables to 1. Given some MAX SAT instance as input, it seems to me that you would also expect this to satisfy half of the clauses (in the same "in expectation" sense). To me, the analysis of the random algorithm really seems to say that any fixed guess is "good," rather than showing that our random algorithm is inherently good. So why go through the process of randomizing and derandomizing in the first place?

Thanks in advance!

Solution

The difference is that the randomized algorithm guarantees an expected 1/2-approximation on every input. By contrast, it is easy for an adversary to construct an input (i.e., an instance of MAX SAT) on which the deterministic "set all variables to true" algorithm satisfies zero clauses: just make every clause consist only of negated literals.
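
To make that concrete, here is a tiny hand-picked instance, using `satisfied_count` from the sketch in the question:

```python
# Every clause is a single negated literal, so the all-true assignment
# satisfies nothing, while the uniformly random assignment satisfies each
# clause with probability 1/2, i.e. 2 of the 4 clauses in expectation.
clauses = [[-1], [-2], [-3], [-4]]

print(satisfied_count(clauses, [True, True, True, True]))      # 0
print(satisfied_count(clauses, [False, False, False, False]))  # 4 (the optimum)
```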

Remember that the sample space for a randomized algorithm is over a set of auxiliary random bits. There is no probability distribution assumed over the inputs. The typical goal of randomized algorithm design is for every input to be handled well in expectation. (Analyzing algorithm behavior over an assumed input distribution is called average-case analysis instead.)

What are auxiliary random bits?

Suppose we have a randomized Turing machine $M_1$ that runs on instances of length $n$ for no more than $T(n)$ time, during which it makes no more than $R(n) \le T(n)$ random decisions. We can turn this machine into a deterministic Turing machine $M_2$ that has two input tapes: the usual tape that contains the input string $x$ of length $n$, and a tape that contains a string $r$ of length $R(n)$. The string $r$ is our string of auxiliary random bits; it determines which "random" decisions the Turing machine is to make. When we say that the randomized Turing machine run $M_1(x)$ accepts with probability $p$, this is equivalent to saying that the set $$A(x) = \left\{r\ |\ r \in \{0, 1\}^{R(|x|)}, M_2(x, r)\text{ accepts}\right\}$$ of $r$ strings that make $M_2(x, r)$ accept constitutes a fraction $p = |A(x)| / 2^{R(|x|)}$ of the set of all $r$ strings.
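
As a toy model of this counting view, the acceptance probability is literally a fraction over all $r$ strings. The `verifier` below is a stand-in for $M_2$, not a real Turing machine; the names are illustrative only:

```python
from itertools import product

def acceptance_probability(verifier, x, r_len):
    """Fraction of auxiliary strings r in {0,1}^r_len for which the
    deterministic verifier(x, r) accepts; this equals the acceptance
    probability of the corresponding randomized machine on input x."""
    hits = sum(1 for r in product((0, 1), repeat=r_len) if verifier(x, r))
    return hits / 2 ** r_len

# A toy stand-in for M_2: accept when the random bits, read as a binary
# number, are at least the input length.
def toy_verifier(x, r):
    return int("".join(map(str, r)), 2) >= len(x)

print(acceptance_probability(toy_verifier, "abc", 3))  # 5/8 = 0.625
```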

You might recognize what is going on here if you've seen the analogous construction for nondeterministic Turing machines. We can think of an NP machine as a nondeterministic machine that branches into exponentially many copies of itself. But we can also think of it as a deterministic verifier machine that requires both an input and a "proof" string, with the acceptance criterion that an input string is in the language if and only if some proof string makes the machine accept.

It is often easier to think about this down-to-Earth concept of deterministic verifier machines and which subset of proof strings makes the machine accept on a given input, rather than thinking about very abstract ideas like exponentially branching machines and possible worlds. And it makes it easier to define complexity classes such as co-NP, PP, BPP, ⊕P, etc., all of which are essentially "NP with a different acceptance rule." For example:

  • NP is the set of languages $L$ for which there exists a polynomial-time verifier machine $M_2$ such that $x \in L$ if and only if there exists an $r$ string such that $M_2(x, r)$ accepts (where the length of the $r$ string is bounded by a polynomial $R(|x|)$).
  • BPP is the set of languages $L$ for which there exists a polynomial-time verifier machine $M_2(x, r)$ such that $x \in L$ implies that $M_2(x, r)$ accepts for at least $2/3$ of $r$ strings and $x \notin L$ implies that $M_2(x, r)$ accepts for at most $1/3$ of $r$ strings (where the length of the $r$ strings is bounded by a polynomial $R(|x|)$).

Note: It mostly doesn't matter whether we require the $r$ strings to have length exactly $R(n)$ or at most $R(n)$, since allowing shorter strings only increases the number of possible strings by a constant factor.
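
As a toy illustration of "NP with a different acceptance rule," the definitions above differ only in the predicate applied to the fraction of accepting $r$ strings. The sketch below reuses `acceptance_probability` from the snippet above; real complexity classes of course quantify over all polynomial-time verifiers, so this only illustrates the acceptance rules themselves:

```python
# Each class applies a different rule to the same quantity: the fraction
# of r strings on which the verifier accepts.

def np_accepts(verifier, x, r_len):
    # NP: x is in the language iff at least one r string makes it accept.
    return acceptance_probability(verifier, x, r_len) > 0

def pp_accepts(verifier, x, r_len):
    # PP: x is in the language iff a strict majority of r strings accept.
    return acceptance_probability(verifier, x, r_len) > 1 / 2

def bpp_accepts(verifier, x, r_len):
    # BPP: at least 2/3 of r strings accept on yes-instances and at most
    # 1/3 on no-instances; the verifier is promised never to land between.
    return acceptance_probability(verifier, x, r_len) >= 2 / 3
```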

Licensed under: CC-BY-SA with attribution