Frage

I've taken a look at this questions , but I still don't see the difference between a Suffix tree and a Trie .

Both have all the substrings of a given string , so where do they differ from one another ?

War es hilfreich?

Lösung

Suffix tree - a large text is given. Query - search many times any words in the text.
Example: You are implementing your own cool text redactor with solitaire and kittens=) You are going to implement CTRL+F function. Probable implementation - index document (create suffix tree) and when user looks for some word - search it in a tree.

Trie - a large text is given. Query - search many times predefined words in the text.
Example: You are implementing your own cool facebook with poker and Justin Bieber's fans=) You don't want your users to post swear words. Probable implementation - create trie of swear words. When users types some text search forbiiden words and replace them with *.

In general, suffix tree= trie. Suffix tree is a trie of all suffixes of some word. When you want search something in a dictionary use trie. When you search something in a solid text use suffix tree.

Important note - building/rebuilding suffix tree for large text is complex operation. Once you changed your text you have to recreate suffix tree. Rebuilding trie is a trivial operation - just add new word in O(wordLength)

Conclusion
Suffix tree. You know nothing about future queries.Spend time once for creating suffix tree and you are ready to process requests. Known info is a text. Situations where requests are unknown but text is given and not going to change oftenly are candidates for using suffix tree. For example you can't use trie (aho-corasick algo) in CTRL+F implementation - because you just can't give dictionary as the input for aho-corasick's algo based on the trie.

Trie. You know nothing about text where you will perform searching. But you know future queries. Spend time for preprocessing/preparing data structure for your queries and you can perform search queries in any text. For example in replacing forbidden words task you don't know what text user will post but you know forbidden words. Creating suffix tree for each short new post would be too stupid=) UPD As @mightyWOZ noticed in comment, pure trie is not applicable but we can use Aho-Corasick algo which is extension over trie. So, statement is still true for tries - there exist approach (Aho-Corasick) which uses trie as a base, pre-processes queries and then can handle any text.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top