Pergunta

I am wondering about how range queries work, and the standard solution is to use B+trees. However, I am a fan of tries as a general data structure and would like to know if they (or variations of them) can do all the things B+trees can do. That is, if they can support fast range/between queries on-par with B+trees, and if there is any other limitation of tries compared to B+trees.

Foi útil?

Solução

A B-Tree is a balanced tree structure that ensures a search in logarithmic time and efficient ordered traversal. Each node has more than two children. This helps to achieve a higher performance than binary trees, because it reduces the number of disk accesses and their storage layout can take advantage of block devices.

A trie on the other hand is a structure that good at exact matches. The complexity is proportional to the size of the key. You may implement an ordered traversal, if you take the precaution of ordering the children of each node. Storage of large tries is the weak point. Either you use variable size lists for the children (and then it's costly to add a new item, especially on disks), or you preallocate allocate for each node space for every possible letter of the alphabet. So it's exponential consumption of storage (acceptable for ascii, but a bad idea for unicode).

You may be interested in a relatively recent research about a combined structure called b-trie, for which algorithms exist to make them more suitable for disk storage.

Licenciado em: CC-BY-SA com atribuição
scroll top