Question

A lot of articles say that the cost of traversing a hash tree to any randomly chosen leaf is $\mathcal{O}(\log_2 N)$ (where $N$ is the number of leaves), and that is correct. If we have a tree of 8 leaves it will take us at most 3 operations to get to any leaf, if we have a tree of 64 leaves it will take at most 6 operations, and so on.

But let's say I need to check every leaf sequentially to verify that all blocks of a file are correct; then I would need $\mathcal{O}(N \log_2 N)$ operations. Or if I checked only every second leaf (just the left leaf of every pair), I would need $\mathcal{O}(\frac{N \log_2 N}{2})$ operations. That is, I need $\mathcal{O}(\log_2 N)$ operations for every single leaf? That gives a cost curve that grows faster than linearly, so it would be better to use a simple hash list or hash chain? Am I right?
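To make the comparison concrete, here is the arithmetic behind that worry (the numbers are illustrative, not from the original post). Restarting from the root for each of the $N$ leaves costs

$$N \cdot \log_2 N \text{ node visits, e.g. } 1024 \cdot 10 = 10240 \text{ for } N = 1024,$$

versus the $1024$ visits a single linear pass over the blocks would need.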

Or I just don't see/know something?

Note: the chart (not reproduced here) uses a logarithmic scale.


Solution

If you want to access every leaf in the tree, you probably do not want to start from the root every time. If you did, you would indeed need $O(N \log N)$ time. If all you want is to visit every leaf, do a depth-first traversal of the tree and you are done in $O(N)$. If you only want every other leaf, do the same traversal but skip the unwanted elements, as in the sketch below.
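A minimal sketch of that idea, assuming a simple binary node class (the names `Node` and `iter_leaves` are illustrative, not from any particular library):

```python
import hashlib


class Node:
    """Node of a binary hash (Merkle) tree: a leaf hashes a data block,
    an internal node hashes the concatenation of its children's hashes."""

    def __init__(self, left=None, right=None, block=None):
        self.left = left
        self.right = right
        self.block = block  # set only on leaves
        if block is not None:
            self.hash = hashlib.sha256(block).digest()
        else:
            self.hash = hashlib.sha256(left.hash + right.hash).digest()


def iter_leaves(node):
    """Depth-first traversal; yields every leaf once. Visiting all leaves
    touches each node of the tree exactly once, so it is O(N) in total."""
    if node.block is not None:  # leaf
        yield node
    else:
        yield from iter_leaves(node.left)
        yield from iter_leaves(node.right)


# Build a tree over 8 example blocks, then visit all leaves in one pass.
blocks = [bytes([i]) * 32 for i in range(8)]
level = [Node(block=b) for b in blocks]
while len(level) > 1:
    level = [Node(left=level[i], right=level[i + 1])
             for i in range(0, len(level), 2)]
root = level[0]

leaves = list(iter_leaves(root))      # all N leaves, O(N)
every_other = leaves[::2]             # "left leaf of every pair", same single pass
print(len(leaves), len(every_other))  # -> 8 4
```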

In data structure design one often also threads the leaves (in order) into a doubly linked list. This costs only a constant factor more space, but speeds up sequential access.
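One possible way to do that, reusing the `Node` class and `iter_leaves` from the sketch above (the `thread_leaves` helper and the `prev`/`next` attributes are made up for illustration):

```python
def thread_leaves(root):
    """Link the leaves of the tree, in order, into a doubly linked list,
    so later sequential scans never have to descend the tree again."""
    leaves = list(iter_leaves(root))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
        right.prev = left
    leaves[0].prev = None
    leaves[-1].next = None
    return leaves[0]  # head of the list


# A sequential check is now a plain pointer walk: O(N), no tree descents.
leaf = thread_leaves(root)
while leaf is not None:
    # e.g. compare leaf.hash against the expected hash of that block here
    leaf = leaf.next
```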

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange