Question

I read in many places that some problems are difficult to approximate (it is NP-hard to approximate them). But approximation is not a decision problem: the answer is a real number, not Yes or No. Also, for each desired approximation factor there are many answers that are correct and many that are wrong, and this changes with the approximation factor!

So how can one say that this problem is NP-hard?

(inspired by the second bullet in How hard is counting the number of simple paths between two nodes in a directed graph?)


Solution

As you said, there is no decision to make, so new complexity classes and new types of reductions are needed to arrive at a suitable definition of NP-hardness for optimization-problems.

One way of doing this is to introduce two new classes, NPO and PO, that contain optimization-problems and that mimic, of course, the classes NP and P for decision problems. New reductions are needed as well. Then we can recreate a version of NP-hardness for optimization-problems along the lines that were successful for decision problems. But first we have to agree on what an optimization-problem is.

Definition: Let $O=(X,L,f,opt)$ be an optimization-problem. $X$ is the set of inputs or instances, suitably encoded as strings. $L$ is a function that maps each instance $x\in X$ to a set of strings, the feasible solutions of instance $x$. It is a set because an optimization-problem usually has many solutions. Then we have an objective function $f$ that tells us for every pair $(x, y)$ with $y\in L(x)$ of instance and solution its cost or value. $opt$ tells us whether we are maximizing or minimizing.

This allows us to define what an optimal solution is: Let $y_{opt}\in L(x)$ be the optimal solution of an instance $x\in X$ of an optimization-problem $O=(X,L,f,opt)$ with $$f(x,y_{opt})=opt\{f(x,y')\mid y'\in L(x)\}.$$ The optimal solution is often denoted by $y^*$.

Now we can define the class NPO: Let $NPO$ be the set of all optimization-problems $O=(X,L,f,opt)$ with:

  1. $X\in P$
  2. There is a polynomial $p$ with $|y|\le p(|x|)$ for all instances $x\in X$ and all feasible solutions $y\in L(x)$. Furthermore there is a deterministic algorithm that decides in polynomial time whether $y\in L(x)$.
  3. $f$ can be evaluated in polynomial time.

The intuition behind it is:

  1. We can verify efficiently if $x$ is actually a valid instance of our optimization problem.
  2. The size of the feasible solutions is bounded polynomially in the size of the input, and we can verify efficiently whether $y\in L(x)$ is a feasible solution of the instance $x$.
  3. The value of a solution $y\in L(x)$ can be determined efficiently.
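
To make these three conditions concrete, here is a minimal sketch (not part of the formalism above; all names are illustrative) that models $\mathsf{Minimum-VertexCover}$ as a tuple $(X,L,f,opt)$: checking that an input encodes a graph, checking feasibility of a candidate cover, and evaluating the objective are all easy, while finding the optimum is the hard part.

```python
from itertools import combinations

def is_instance(x):
    """Condition 1: deciding whether x encodes a graph is easy (here: 2-element edges over range(n))."""
    n, edges = x
    return all(len(e) == 2 and 0 <= e[0] < n and 0 <= e[1] < n and e[0] != e[1] for e in edges)

def is_feasible(x, y):
    """Condition 2: a candidate solution y (a vertex set) is short and checkable in polynomial time."""
    n, edges = x
    cover = set(y)
    return cover <= set(range(n)) and all(u in cover or v in cover for (u, v) in edges)

def value(x, y):
    """Condition 3: the objective f(x, y) is computable in polynomial time."""
    return len(set(y))

opt = min  # a minimization problem

def brute_force_optimum(x):
    """Only for tiny instances: an optimal solution y* always exists, but finding it is the hard part."""
    n, _ = x
    feasible = (set(c) for k in range(n + 1) for c in combinations(range(n), k) if is_feasible(x, c))
    return opt(feasible, key=lambda y: value(x, y))

x = (4, [(0, 1), (1, 2), (2, 3)])  # a path on four vertices
assert is_instance(x)
print(brute_force_optimum(x))      # prints a minimum cover of size 2, e.g. {0, 2}
```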

This mirrors how $NP$ is defined. Now for PO: Let $PO$ be the set of all problems from $NPO$ that can be solved by a deterministic algorithm in polynomial time.

Now we are able to define what we want to call an approximation-algorithm: An approximation-algorithm of an optimization-problem $O=(X,L,f,opt)$ is an algorithm that computes a feasible solution $y\in L(x)$ for an instance $x\in X$.

Note that we don't ask for an optimal solution; we only want a feasible one.

Now we have two types of errors: The absolute error of a feasible solution $y\in L(x)$ of an instance $x\in X$ of the optimization-problem $O=(X,L,f,opt)$ is $|f(x,y)-f(x,y^*)|$.

We call the absolute error of an approximation-algorithm $A$ for the optimization-problem $O$ bounded by $k$ if the algorithm $A$ computes for every instance $x\in X$ a feasible solution with an absolute error bounded by $k$.

Example: According to Vizing's Theorem the chromatic index of a graph (the minimum number of colours needed in an edge coloring) is either $\Delta$ or $\Delta+1$, where $\Delta$ is the maximal node degree. From the proof of the theorem an approximation-algorithm can be devised that computes an edge coloring with $\Delta+1$ colours. Accordingly we have an approximation-algorithm for the $\mathsf{Minimum-EdgeColoring}$-Problem whose absolute error is bounded by $1$.

This example is an exception; small absolute errors are rare. Thus we define the relative error $\epsilon_A(x)$ of the approximation-algorithm $A$ on instance $x$ of the optimization-problem $O=(X,L,f,opt)$ with $f(x,y)>0$ for all $x\in X$ and $y\in L(x)$ to be

$$\epsilon_A(x):=\begin{cases}0&f(x,A(x))=f(x,y^*)\\\frac{|f(x,A(x))-f(x,y^*)|}{\max\{f(x,A(x)),f(x,y^*)\}}&f(x,A(x))\ne f(x,y^*)\end{cases}$$

where $A(x)=y\in L(x)$ is the feasible solution computed by the approximation-algorithm $A$.

We now call an approximation-algorithm $A$ for the optimization-problem $O=(X,L,f,opt)$ a $\delta$-approximation-algorithm for $O$ if the relative error $\epsilon_A(x)$ is bounded by $\delta\ge 0$ for every instance $x\in X$, thus $$\epsilon_A(x)\le \delta\qquad \forall x\in X.$$

The maximum $\max\{f(x,A(x)),f(x,y^*)\}$ in the denominator of the relative error was chosen to make the definition symmetric for maximization and minimization. The relative error always lies in $[0,1)$. For a maximization problem the value of the computed solution is never less than $(1-\epsilon_A(x))\cdot f(x,y^*)$; for a minimization problem it is never larger than $1/(1-\epsilon_A(x))\cdot f(x,y^*)$.

Now we can call an optimization-problem $O$ $\delta$-approximable if there is a $\delta$-approximation-algorithm $A$ for $O$ that runs in polynomial time.

We do not want to look at the error for every single instance $x$; we look only at the worst case. Thus we define $\epsilon_A(n)$, the maximal relative error of the approximation-algorithm $A$ for the optimization-problem $O$, to be $$\epsilon_A(n)=\sup\{\epsilon_A(x)\mid |x|\le n\},$$ where $|x|$ denotes the size of the instance.

Example: A maximal matching in a graph can be transformed into a vertex cover $C$ by adding both endpoints of every matching edge to the cover. Thus the matching contains $1/2\cdot |C|$ edges. Since these edges are pairwise disjoint, every vertex cover, including the optimal one, must contain at least one endpoint of each of them, hence $1/2\cdot |C|\le f(x,y^*)$. It follows that $$\frac{|C|-f(x,y^*)}{|C|}\le\frac{1}{2}.$$ Thus the greedy algorithm for a maximal matching is a $1/2$-approximation-algorithm for $\mathsf{Minimum-VertexCover}$. Hence $\mathsf{Minimum-VertexCover}$ is $1/2$-approximable.
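
For completeness, a small sketch of this matching-based algorithm (a straightforward implementation, assuming the graph is given as an edge list):

```python
def vertex_cover_via_matching(edges):
    """Greedily build a maximal matching and return both endpoints of every matching edge."""
    cover = set()
    for (u, v) in edges:
        if u not in cover and v not in cover:  # the edge is still unmatched, add it to the matching
            cover.add(u)
            cover.add(v)
    return cover

# On the path 0-1-2-3 the optimum has size 2; the algorithm may return all four vertices,
# which still respects approximation-ratio 2 (relative error 1/2).
print(vertex_cover_via_matching([(0, 1), (1, 2), (2, 3)]))  # {0, 1, 2, 3}
```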

Unfortunately the relative error is not always the best notion of quality for an approximation as the following example demonstrates:

Example: A simple greedy-algorithm can approximate $\mathsf{Minimum-SetCover}$. An analysis shows that $$\frac{|C|}{|C^*|}\le H_n\le 1+\ln(n)$$ and thus $\mathsf{Minimum-SetCover}$ would be $\frac{\ln(n)}{1+\ln(n)}$-approximable.
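
A sketch of that greedy algorithm (assuming the universe is finite and the given sets actually cover it):

```python
def greedy_set_cover(universe, sets):
    """Repeatedly pick the set that covers the most still-uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(sets, key=lambda s: len(uncovered & s))
        if not uncovered & best:
            raise ValueError("the given sets do not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen

universe = range(1, 7)
sets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
# The returned cover uses at most H_n <= 1 + ln(n) times as many sets as an optimal cover.
print(greedy_set_cover(universe, sets))   # [{1, 2, 3, 4}, {5, 6}]
```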

If the relative error is close to $1$ the following definition is advantageous.

Let $O=(X,L,f,opt)$ be an optimization-problem with $f(x, y)>0$ for all $x\in X$ and $y\in L(x)$ and $A$ an approximation-algorithm for $O$. The approximation-ratio $r_A(x)$ of feasible solution $A(x)=y\in L(x)$ of the instance $x\in X$ is $$r_A(x)=\begin{cases}1&f(x,A(x))=f(x,y^*)\\\max\left\{ \frac{f(x,A(x))}{f(x, y^*)},\frac{f(x, y^*)}{f(x, A(x))}\right\}&f(x,A(x))\ne f(x,y^*)\end{cases}$$

As before we call an approximation-algorithm $A$ an $r$-approximation-algorithm for the optimization-problem $O$ if the approximation-ratio $r_A(x)$ is bounded by $r\ge1$ for every input $x\in X$: $$r_A(x)\le r.$$ And yet again, if we have an $r$-approximation-algorithm $A$ for the optimization-problem $O$, then $O$ is called $r$-approximable. Again we only care about the worst case and define the maximal approximation-ratio $r_A(n)$ to be $$r_A(n)=\sup\{r_A(x)\mid |x|\le n\}.$$

Accordingly the approximation-ratio is larger than $1$ for suboptimal solutions; better solutions have smaller ratios. For $\mathsf{Minimum-SetCover}$ we can now write that it is $(1+\ln(n))$-approximable, and in the case of $\mathsf{Minimum-VertexCover}$ we know from the previous example that it is $2$-approximable. Between the relative error and the approximation-ratio we have the simple relations $$r_A(x)=\frac{1}{1-\epsilon_A(x)}\qquad \epsilon_A(x)=1-\frac{1}{r_A(x)}.$$
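
As a small sanity check of these relations, here is a sketch (with hypothetical helper names) that computes both measures for a single solution and verifies $r_A(x)=1/(1-\epsilon_A(x))$:

```python
def relative_error(val, opt_val):
    """epsilon for a single feasible solution, following the definition above."""
    return 0.0 if val == opt_val else abs(val - opt_val) / max(val, opt_val)

def approximation_ratio(val, opt_val):
    """r for a single feasible solution, following the definition above."""
    return 1.0 if val == opt_val else max(val / opt_val, opt_val / val)

# Vertex-cover style numbers: the algorithm found a cover of size 4, the optimum has size 2.
eps = relative_error(4, 2)           # 0.5
r = approximation_ratio(4, 2)        # 2.0
assert abs(r - 1 / (1 - eps)) < 1e-9
assert abs(eps - (1 - 1 / r)) < 1e-9
```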

For small deviations from the optimum, i.e. $\epsilon<1/2$ and $r<2$, the relative error is advantageous over the approximation-ratio, which in turn shows its strengths for large deviations $\epsilon\ge 1/2$ and $r\ge 2$.

The two versions of $\alpha$-approximable don't overlap, as one version always has $\alpha\le 1$ and the other $\alpha\ge 1$. The case $\alpha=1$ is not problematic, as it is only reached by algorithms that produce an exact solution, and those need not be treated as approximation-algorithms.

Another class that appears often is APX. It is defined as the set of all optimization-problems $O$ from $NPO$ that have an $r$-approximation-algorithm for some constant $r\ge1$ that runs in polynomial time.

We are almost through. We would like to copy the successful ideas of reductions and completeness from complexity theory. The observation is that many NP-hard decision variants of optimization-problems are reducible to each other, while their optimization variants have different properties regarding their approximability. This is due to the polynomial-time Karp reduction used in NP-completeness proofs, which does not preserve the objective function. And even if the objective function is preserved, the polynomial-time Karp reduction may change the quality of the solution.

What we need is a stronger version of the reduction, which not only maps instances from optimization-problem $O_1$ to instances of $O_2$, but also good solutions from $O_2$ back to good solutions from $O_1$.

Hence we define the approximation-preserving-reduction for two optimization-problems $O_1=(X_1,L_1,f_1,opt_1)$ and $O_2=(X_2,L_2,f_2,opt_2)$ from $NPO$. We call $O_1$ $AP$-reducible to $O_2$, written as $O_1\le_{AP} O_2$, if there are two functions $g$ and $h$ and a constant $c$ with:

  1. $g(x_1, r)\in X_2$ for all $x_1\in X_1$ and rational $r>1$
  2. $L_2(g(x_1, r))\ne\emptyset$ if $L_1(x_1)\ne\emptyset$ for all $x_1\in X_1$ and rational $r>1$
  3. $h(x_1, y_2, r)\in L_1(x_1)$ for all $x_1\in X_1$ and rational $r>1$ and for all $y_2\in L_2(g(x_1,r))$
  4. For fixed $r$ both functions $g$ and $h$ can be computed by two algorithms in polynomial time in the length of their inputs.
  5. If the solution $y_2$ has approximation-ratio at most $r$ for the instance $g(x_1,r)$ of $O_2$, then the solution $h(x_1,y_2,r)$ has approximation-ratio at most $1+c\cdot(r-1)$ for the instance $x_1$ of $O_1$; this must hold for all $x_1\in X_1$, all rational $r>1$ and all $y_2\in L_2(g(x_1,r))$

In this definition, $g$ and $h$ depend on the quality $r$ of the desired solution, so for different qualities the functions can differ. This generality is not always needed, and often we just work with $g(x_1)$ and $h(x_1, y_2)$.
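
To illustrate only the shape of such a reduction, here is a trivial sketch: the identity reduction $O\le_{AP}O$ with $c=1$, written as the three ingredients $g$, $h$ and $c$ the definition asks for (real AP-reductions of course use problem-specific maps):

```python
def g(x1, r):
    """Maps an instance of O_1 (together with the quality parameter r) to an instance of O_2."""
    return x1          # identity: here O_1 and O_2 are the same problem

def h(x1, y2, r):
    """Maps a feasible solution of g(x1, r) back to a feasible solution of x1."""
    return y2          # identity again

c = 1                  # condition 5: ratio r for O_2 gives ratio 1 + c*(r - 1) = r for O_1
```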

Now that we have a notion of a reduction for optimization-problems we finally can transfer many things we know from complexity theory. For example if we know that $O_2\in APX$ and we show that $O_1\le_{AP} O_2$ it follows that $O_1\in APX$ too.

Finally we can define what we mean by $\mathcal{C}$-hard and $\mathcal{C}$-complete for optimization-problems:

Let $O$ be an optimization-problem from $NPO$ and $\mathcal{C}$ a class of optimization-problems from $NPO$. Then $O$ is called $\mathcal{C}$-hard with respect to $\le_{AP}$ if $O'\le_{AP} O$ holds for all $O'\in\mathcal{C}$.

Thus once more we have a notion of a hardest problem in the class. Not surprisingly, a $\mathcal{C}$-hard problem is called $\mathcal{C}$-complete with respect to $\le_{AP}$ if it is an element of $\mathcal{C}$.

Thus we can now talk about $NPO$-completeness and $APX$-completeness etc. And of course we are now asked to exhibit a first $NPO$-complete problem that takes over the role of $\mathsf{SAT}$. It comes almost naturally that $\mathsf{Weighted-Satisfiability}$ can be shown to be $NPO$-complete. With the help of the PCP-Theorem one can even show that $\mathsf{Maximum-3SAT}$ is $APX$-complete.

OTHER TIPS

Usually what's shown is the NP-hardness of a "Gap" version of the problem. For example, suppose that you want to show that it's hard to approximate SET COVER to within a factor of 2.

You define the following "promise" instance of SET COVER that we'll call 2-GAP-SET-COVER:

Fix some number $\ell$. 2-GAP-SET-COVER consists of all instances of set cover where the size of the optimal set cover is either:

  • at most $\ell$
  • at least $2\ell$

Suppose we show that the problem of deciding which of the two cases an instance falls into is NP-complete. Then we've shown that approximating SET COVER to within a factor of 2 is NP-hard, because we could use any such approximation algorithm to distinguish the two cases.
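
A sketch of that argument in code (purely hypothetical: `approx_set_cover` stands for a polynomial-time algorithm whose cover is strictly smaller than twice the optimum, so that the two promised cases cannot collide at exactly $2\ell$):

```python
def decide_2_gap_set_cover(instance, ell, approx_set_cover):
    """Decide the promise problem, given an approximation algorithm with ratio strictly below 2."""
    cover_size = len(approx_set_cover(instance))
    # Promise: either OPT <= ell or OPT >= 2 * ell.
    # If OPT <= ell, the returned cover has size strictly below 2 * ell.
    # If OPT >= 2 * ell, every cover, including the returned one, has size at least 2 * ell.
    return "OPT <= ell" if cover_size < 2 * ell else "OPT >= 2 * ell"
```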

The two existing answers are very informative but I don't think either of them really answers the question, which is, "How can a problem that isn't even a decision problem be NP-hard, when NP is a class of decision problems?"

The answer is to remember what NP-hard means. A problem $L$ is NP-hard under some kind of reductions if every problem in NP can be reduced to $L$. (And $L$ is NP-complete if it's NP-hard and in NP.) Informally, NP-hard means "If I could solve this problem, I could solve everything in NP", even if the problem you're talking about isn't in NP. NP-hardness does not require membership of NP, but it does need the right notion of reduction.

Some examples.

  1. Any NEXPTIME-complete problem $L$ is NP-hard under polynomial-time many-one reductions. Any problem in NP is in NEXPTIME, so it is reducible to $L$ by definition. By the nondeterministic time hierarchy theorem, $L$ cannot be in NP, so $L$ is not NP-complete.
  2. #SAT is the problem of computing the number of satisfying assignments to CNF formulas. It is clearly not in NP because, as you observe, NP is a class of decision problems and #SAT is not one of those. However, #SAT is NP-hard under polynomial-time Turing reductions because we can reduce SAT to it. Given a SAT instance, we ask how many satisfying assignments there are: if there is at least one, we say "satisfiable"; otherwise, "unsatisfiable".
  3. Let APPROXSAT be the problem of computing a number which is within a factor of ten of the number of satisfying assignments to a CNF formula $\varphi$. Just to be annoying, let's say you're allowed to round down so, if $\varphi$ has, say, three satisfying assignments, the algorithm is allowed to think "0.3" and round that down to zero. This is NP-hard under polynomial-time Turing reductions because we can still reduce SAT to it. Given a CNF formula $\varphi$, ask for the number of satisfying assignments to $\varphi'=\varphi \wedge (Z_1\vee \dots \vee Z_{10})$, where the $Z_i$ are new variables. $\varphi'$ is satisfiable if, and only if, $\varphi$ is, but $\varphi'$ is guaranteed to have more than 1,000 satisfying assignments, if it has any. So, $\varphi$ is satisfiable if, and only if, the APPROXSAT algorithm says that $\varphi'$ has at least 100 satisfying assignments.
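
Here is a sketch of the padding step from the last example, assuming a hypothetical oracle `approx_count` that answers within a factor of ten of the true count (rounding down allowed) and a DIMACS-style clause representation:

```python
def is_satisfiable_via_approx_count(cnf, num_vars, approx_count):
    """cnf: list of clauses, each a list of non-zero ints (DIMACS-style literals)."""
    # Conjoin one clause (Z_1 or ... or Z_10) over ten fresh variables.
    fresh_clause = list(range(num_vars + 1, num_vars + 11))
    padded = cnf + [fresh_clause]
    # If phi is unsatisfiable, phi' has 0 models, so the oracle must answer below 100.
    # If phi is satisfiable, phi' has at least 2**10 - 1 = 1023 models, so the oracle answers at least 102.
    return approx_count(padded) >= 100
```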
Licensed under: CC-BY-SA with attribution