Difference between Tries and Trees?

https://stackoverflow.com/questions/4737904

tree
trie

12-10-2019

Question

I vaguely remember that tries don't store the whole data per node, only the suffix relative to the parent node.

Trees, in contrast, do store the whole data, but only organize themselves based on a prefix.

So tries get smaller, which allows, for example, dictionaries to be compressed very well.

Is that really the only difference?

From actual applications I remember that tries are faster for range queries. There are even special Solr/Lucene trie fields to speed up range queries. But why is that so?

What is the actual difference and what are the advantages and disadvantages of tries and trees?

Solution

A tree is a general structure of recursive nodes. There are many types of trees; popular ones are the binary tree and the balanced tree. A trie is a kind of tree, known by many names including prefix tree, digital search tree, and retrieval tree (hence the name 'trie').

Each kind of tree has a different purpose, structure and behaviour. For example, a binary search tree stores a collection of comparable items (e.g. numbers). It can therefore be used to store a set of numbers, or to index other data that can be represented by numbers (e.g. objects that can be hashed). Its structure is sorted, so it can be searched quickly to find a single item. Other tree structures, such as a balanced tree, are similar in principle.
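To make the sorted structure concrete, here is a minimal, unbalanced sketch in Python. The names `Node`, `insert` and `contains` are made up for this illustration and do not come from any library; a real implementation would usually also rebalance.

```python
from typing import Optional


class Node:
    """One node of a binary search tree holding a single comparable key."""

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key and return the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root


def contains(root: Optional[Node], key: int) -> bool:
    """Compare the whole key against each visited node, going left or right."""
    node = root
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


root = None
for k in [50, 30, 70, 20, 40]:
    root = insert(root, k)

print(contains(root, 40))  # True
print(contains(root, 60))  # False
```

Every step of the search compares the complete key with the key stored in the current node; this is the behaviour that a trie does not share.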

A trie represents a sequence in its structure. It is very different in that it stores sequences of values rather than individual single values. Each level of recursion asks 'what is the value of item i of the input list?'. This is different from a binary search tree, which compares the whole searched value to each node.
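As a rough sketch of that idea, a trie can be modelled in Python with nested dictionaries, where level i of the nesting is indexed by item i of the sequence. The `END` sentinel and the function names are assumptions made for this example only.

```python
END = object()  # hypothetical sentinel: "a stored sequence ends at this node"


def trie_insert(trie: dict, seq) -> None:
    """Walk or create one nested dict per item of the sequence."""
    node = trie
    for item in seq:
        node = node.setdefault(item, {})
    node[END] = True


def trie_contains(trie: dict, seq) -> bool:
    """Level i of the walk only looks at item i; no whole-key comparison."""
    node = trie
    for item in seq:
        if item not in node:
            return False
        node = node[item]
    return END in node


trie = {}
for word in ["tea", "ten", "to"]:
    trie_insert(trie, word)

print(trie_contains(trie, "ten"))  # True
print(trie_contains(trie, "te"))   # False: "te" is only a prefix of stored words
```

All three words share the single top-level branch `"t"`, and `"tea"` and `"ten"` also share `"e"`, so common prefixes are stored only once.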

OTHER TIPS

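A binary tree, or more precisely a binary search tree (BST), is typically used to store numerical values. The time complexity in a BST is O(log(n)) for insertion, deletion and searching when the tree is balanced; an unbalanced BST can degrade to O(n). Each node in a binary tree has at most 2 child nodes.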

Trie: Every node of a trie can have multiple branches. Each branch represents a possible character of the keys. We need to mark the last node of every key as an end-of-key node (often loosely called a leaf node, although it can still have children when one key is a prefix of another). A value field on the trie node is used to distinguish such nodes (the value field has other uses too).
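A node layout along those lines might look like the following Python sketch. `TrieNode`, `put`, `get` and `items_with_prefix` are hypothetical names chosen for this example, and using `None` to mean "no key ends here" is an assumption (so `None` itself cannot be stored as a value).

```python
from typing import Dict, Iterator, Optional, Tuple


class TrieNode:
    """One branch per character, plus a value field that marks the end of a key."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.value: Optional[int] = None  # assumption: None means "no key ends here"


def put(root: TrieNode, key: str, value: int) -> None:
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.value = value  # marks the last node of the key


def get(root: TrieNode, key: str) -> Optional[int]:
    node = root
    for ch in key:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.value


def items_with_prefix(root: TrieNode, prefix: str) -> Iterator[Tuple[str, int]]:
    """Yield every (key, value) pair stored below the node reached by prefix."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return
    stack = [(prefix, node)]
    while stack:
        key, node = stack.pop()
        if node.value is not None:
            yield key, node.value
        for ch, child in node.children.items():
            stack.append((key + ch, child))


root = TrieNode()
put(root, "car", 1)
put(root, "cart", 2)
put(root, "dog", 3)

print(get(root, "car"))                         # 1
print(get(root, "ca"))                          # None
print(sorted(items_with_prefix(root, "car")))   # [('car', 1), ('cart', 2)]
```

In this sketch the node that ends `"car"` still has a child for `"cart"`, so the end-of-key mark and being a childless leaf are not the same thing.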

To learn about tries, refer to this Topcoder tutorial: https://www.topcoder.com/community/data-science/data-science-tutorials/using-tries/

Just got some insights from this talk, even though the radix tree used in the Linux kernel is slightly different from the one on Wikipedia.

Trees only store keys; they don't store the values associated with the keys.

Licensed under: CC-BY-SA with attribution
