Is there an algorithm to share parts of strings? [duplicate]
-
05-11-2019 - |
Pergunta
This question already has an answer here:
Let's say that we have the strings Hello, World
, Hello
and World
. The algorithm I am thinking of would create an immutable buffer that contains Hello, World
and the previous three would then become references or slices of that one like buf[0,11], buf[0,4], buf[7,11]
.
A way to implement this algorithm would be to compare all new strings to the existing buffer and reuse it if possible or append the new string if that's not the case. This would be very slow, but it should work. However, it does not efficiently handle a case like Hey, Hello
because it would either append it to the buffer, adding Hello
again, or add Hey,
to the beginning and update all the indices.
If saving space is the optimal solution, then the final buffer would be Hey, Hello, World
.
Is there any algorithm that does something similar?
I have read about ropes, and read this article, but I do not see how these would help.
EDIT: this is indeed a shortest superstring problem, but the implementations I have found (1,2) assume that no string in the array could be a substring of another one, which is not the case here.
Nenhuma solução correta