Question

In this paper on PDF page 3, they give $\{c (ab)^k, (ba)^k, (ab)^k c\}$ as an example of an input to the GREEDY algorithm for shortest superstring that causes it to have an approximation ratio of 2.

As far as I can tell, the shortest superstring of that set is either $c (ab)^k c (ba)^k$ or $b (ab)^k c (ab)^k$ -- either way, the length is $4k + 2$. Unfortunately, this contradicts the claim in the paper that GREEDY outputs a string of length 2 times the optimal superstring.

  • Without loss of generality, we can assume that GREEDY will never output a string longer than the concatenation of its input set -- if the GREEDY algorithm really is described in that way, we can just add an if statement to check that case and return the concatenation instead.
  • The length of the concatenation of the input strings is $6k + 2$.
  • For all $k \in \mathbb{N}$, $1 \le \frac{6k + 2}{4k + 2} \le 1.5$
  • Therefore either the approximation ratio for any reasonable algorithm on this class of problems must be less than or equal to 1.5, or what I think is the shortest superstring is not actually the shortest one.

Can someone either explain why this line of argumentation doesn't work, or say what the actual shortest superstring of this set is?

Was it helpful?

Solution

A shorter superstring is $c(ab)^{k+1}c$, which has length $2k+4$.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top