Is there an algorithm to share parts of strings? [duplicate]

https://cs.stackexchange.com/questions/113025

05-11-2019
|

Pergunta

This question already has an answer here:

Finding the smallest string that contains a given set of substrings 1 answer

Let's say that we have the strings Hello, World, Hello and World. The algorithm I am thinking of would create an immutable buffer that contains Hello, World and the previous three would then become references or slices of that one like buf[0,11], buf[0,4], buf[7,11].

A way to implement this algorithm would be to compare all new strings to the existing buffer and reuse it if possible or append the new string if that's not the case. This would be very slow, but it should work. However, it does not efficiently handle a case like Hey, Hello because it would either append it to the buffer, adding Hello again, or add Hey, to the beginning and update all the indices.

If saving space is the optimal solution, then the final buffer would be Hey, Hello, World.

Is there any algorithm that does something similar?

I have read about ropes, and read this article, but I do not see how these would help.

EDIT: this is indeed a shortest superstring problem, but the implementations I have found (1,2) assume that no string in the array could be a substring of another one, which is not the case here.

Nenhuma solução correta

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange