Question

One of the biggest advantages of using heaps for priority queues (as opposed to, say, red-black trees) seems to be space efficiency: unlike balanced BSTs, heaps require only O(1) auxiliary storage.
That is, the ordering of the elements alone is sufficient to satisfy the invariants and guarantees of a heap, and no child/parent pointers are necessary.

However, my question is: is the above really true?

It seems to me that, in order to use an array-based heap while satisfying the O(log n) running time guarantee, the array must be dynamically expandable.

The only way we can dynamically expand an array in amortized O(1) time per insertion is to over-allocate the memory that backs the array, so that the cost of each reallocation is spread over future insertions.

But then, doesn't that over-allocation imply that a heap also requires more than O(1) auxiliary storage?
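
To make the concern concrete, here is a minimal sketch of geometric over-allocation (the class and its members are illustrative names, not any real library's API), showing that amortized O(1) appends are bought with up to O(n) unused slack:

```python
class GrowableArray:
    """Illustrative dynamic array with a doubling growth policy."""

    def __init__(self):
        self._capacity = 1
        self._length = 0
        self._slots = [None] * self._capacity  # over-allocated backing store

    def append(self, value):
        if self._length == self._capacity:
            # Doubling: this O(n) copy happens only after n cheap appends,
            # so the amortized cost per append is O(1).
            self._capacity *= 2
            new_slots = [None] * self._capacity
            new_slots[:self._length] = self._slots
            self._slots = new_slots
        self._slots[self._length] = value
        self._length += 1

    def wasted_slots(self):
        # The auxiliary storage the question is about: up to `length`
        # unused slots, i.e. O(n), not O(1).
        return self._capacity - self._length
```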

If the above is true, that seems to imply that heaps have no complexity advantage over balanced BSTs whatsoever, so what makes heaps "interesting" from a theoretical point of view?

Solution

You appear to be conflating binary heaps, heaps in general, and implementations of binary heaps that use an array instead of an explicit tree structure.

A binary heap is simply a binary tree with properties that make it theoretically interesting beyond memory use. It can be built in linear time, whereas building a BST necessarily takes Ω(n log n) time. For example, this can be used to select the k smallest/largest values of a sequence in O(n + k log n) time, i.e., better than n log n when k is small.
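
For instance, Python's standard heapq module exposes exactly this: heapify builds the heap bottom-up in linear time, and popping k times afterwards costs O(k log n):

```python
import heapq
import random

data = [random.randrange(1_000_000) for _ in range(100_000)]

# Bottom-up construction in O(n); building a balanced BST by repeated
# insertion would cost Omega(n log n) comparisons instead.
heap = list(data)
heapq.heapify(heap)

# Selecting the k smallest values costs O(n + k log n) overall,
# which beats full sorting whenever k is small relative to n.
k = 10
smallest = [heapq.heappop(heap) for _ in range(k)]
print(smallest == sorted(data)[:k])  # True
```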

Implementing a binary heap in an array yields an implicit data structure. This is a topic formalized by theorists, though not actively pursued by most of them. In any case, the classification is justified: to expand this structure dynamically one does indeed need to over-allocate, but not every data structure has to grow dynamically, so the case of a statically-sized, pre-allocated data structure is interesting as well.
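
As a minimal sketch of the implicit layout (a statically-sized min-heap; the function names are mine, not from any library), note that parent/child links are pure index arithmetic, so no pointers are stored per element:

```python
# Implicit links: node i has parent (i - 1) // 2 and children 2i+1, 2i+2.
def parent(i): return (i - 1) // 2
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2

def sift_down(a, i, n):
    """Restore the min-heap property at index i within a[:n]."""
    while True:
        smallest = i
        if left(i) < n and a[left(i)] < a[smallest]:
            smallest = left(i)
        if right(i) < n and a[right(i)] < a[smallest]:
            smallest = right(i)
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def build_heap(a):
    """Bottom-up heap construction in O(n): only inner nodes are sifted."""
    for i in range(len(a) // 2 - 1, -1, -1):
        sift_down(a, i, len(a))
```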

Furthermore, there is a difference between space needed for faster growth and space needed because each element is larger than it has to be. The first kind can be avoided entirely by not growing, and can also be reduced to an arbitrarily small constant fraction of the total size at the cost of a greater constant factor on running time. The latter kind is usually unavoidable and cannot be reduced much (the pointers in a tree take at least log n bits each, period).
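
As illustrative arithmetic (my own example, not from the original answer): with growth factor g = 1 + eps, the slack after a reallocation is at most roughly eps * n slots, but the total number of elements copied while growing to size n behaves like n / eps, so shrinking the space overhead inflates the constant factor on time:

```python
def total_copies(n, growth):
    """Count elements moved by reallocations while growing to capacity n."""
    copies, cap = 0, 1
    while cap < n:
        copies += cap  # every reallocation copies the current contents
        cap = max(cap + 1, int(cap * growth))
    return copies

# Smaller growth factors waste fewer slots but copy far more elements.
for g in (2.0, 1.5, 1.1):
    print(g, total_copies(10**6, g))
```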

Finally, there are many heaps other than binary heaps (Fibonacci, binomial, leftist, pairing, ...), and almost all of them offer better bounds than binary heaps for at least some operations. The most common ones are decrease-key (lowering the key of an element already in the structure) and merge (combining two heaps into one). The complexity of these operations is important for the analysis of several algorithms that use priority queues, and hence motivates much of the research into heaps.
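
For comparison, here is a hedged sketch of decrease-key on a plain array-backed min-heap (names are mine): it costs O(log n) here, whereas a Fibonacci heap supports it in O(1) amortized, which is exactly what improves Dijkstra's algorithm to O(E + V log V):

```python
def decrease_key(a, i, new_key):
    """Lower the key at index i of min-heap a, then sift it up: O(log n)."""
    assert new_key <= a[i], "decrease-key must not increase the key"
    a[i] = new_key
    while i > 0 and a[(i - 1) // 2] > a[i]:
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
        i = (i - 1) // 2
```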

In practice, memory use is important. But with or without over-allocation, the difference is only a constant factor overall, so theorists are not terribly interested in binary heaps. They would rather get better complexity for decrease-key and merge; most of them are happy as long as the data structure takes O(n) space. The extremely high memory density, ease of implementation, and cache friendliness are far more interesting to practitioners, and it is they who sing the praises of binary heaps far and wide.

Licensed under: CC-BY-SA with attribution