Problem

Just like it was asked here, I fail to understand how we can find the index of a relaxed vertex in the heap.

Programming style-wise, the heap is a black box that abstracts away the details of a priority queue. Now if we need to maintain a hash table that maps vertex keys to corresponding indices in the heap array, that would need to be done in heap implementation, right?

But most standard heaps don't provide a hash table that does such mapping.

Another way to deal with this whole problem is to always add relaxed vertices to the heap, even if they are already in it. When we extract the minimum, we'll get the best entry. To prevent the same vertex from being processed multiple times, we can mark it visited.

So my exact question is, what is the typical way (in the industry) of dealing with this problem?

What are the pros and cons compared to the methods I mentioned?


Solution 2

You are asking some very valid questions but unfortunately they are kind of vague so we won't be able to give you a 100% solid "industry standard" answer. However, I'll try to go over your points anyway:

Programming style-wise, the heap is a black box that abstracts away the details of a priority queue

Technically, a priority queue is the abstract interface (insert elements with a priority, extract the lowest-priority element) and a heap is a concrete implementation (array-based heap, binomial heap, Fibonacci heap, etc.).

What I'm trying to say is that using an array is only one particular way to implement a priority queue.

Now if we need to maintain a hash table that maps vertex keys to corresponding indices in the heap array, that would need to be done in heap implementation, right?

Yes, because every time you move an element inside the array you will need to update its index in the hash table.
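
To see why the map has to live inside the heap, here is a sketch of hypothetical heap internals (the names `heap`, `pos`, and `swap` are made up for illustration): even a single swap of two slots must write the vertex-to-index map back, or it silently goes stale.

```python
# Hypothetical internals of an indexed binary heap: `heap` is the
# array of (priority, vertex) pairs and `pos` maps vertex -> index.
heap = [(1, "a"), (5, "b"), (3, "c")]
pos = {"a": 0, "b": 1, "c": 2}

def swap(i, j):
    """Swap two heap slots and keep the vertex->index map in sync."""
    heap[i], heap[j] = heap[j], heap[i]
    pos[heap[i][1]] = i  # without these two updates, decreaseKey
    pos[heap[j][1]] = j  # would later sift the wrong slot

swap(1, 2)
print(heap)  # [(1, 'a'), (3, 'c'), (5, 'b')]
print(pos)   # {'a': 0, 'b': 2, 'c': 1}
```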

But most standard heaps don't provide a hash table that does such mapping.

Yes. This can be very annoying.

Another way to deal with this whole problem is to add the relaxed vertices to the heap regardless of anything.

I guess that could work, but I don't think I have ever seen anyone do that. The whole point of using a heap here is to increase performance, and by adding redundant elements to the heap you kind of go against that. Sure, you preserve the "black-boxness" of the priority queue, but I don't know if that is worth it. Additionally, there is a chance that the extra pop_heap operations could negatively affect your asymptotic complexity, but I'd have to do the math to check.

what is the typical way (in the industry) of dealing with this problem?

First of all, ask yourself if you can get away with using a dumb array instead of a priority queue. Sure, finding the minimum element is now O(n) instead of O(log n), but the implementation is the simplest possible (an advantage on its own). Additionally, using an array will be just as efficient if your graph is dense, and even if your graph is sparse it might be efficient enough, depending on how big your graph is.

If you really need a priority queue, then you are going to have to find one that implements a decreaseKey operation. If you can't find one, I would say it's not that bad to implement it yourself - it might be less trouble than trying to find an existing implementation and then trying to fit it in with the rest of your code.
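
If you do roll your own, a minimal sketch might look like the following: a binary min-heap whose `_pos` map gives O(1) lookup of a vertex's slot, so `decrease_key` is just an in-place overwrite plus a sift-up. This is an illustration under assumed names (`IndexedMinHeap`, `insert`, `extract_min`, `decrease_key` are not from any particular library), not production code.

```python
class IndexedMinHeap:
    """Binary min-heap with a vertex -> index map so decrease_key
    runs in O(log n). Sketch only: no error handling for missing
    or duplicate vertices."""

    def __init__(self):
        self._heap = []  # list of (key, vertex) pairs
        self._pos = {}   # vertex -> index in _heap

    def __len__(self):
        return len(self._heap)

    def insert(self, vertex, key):
        self._heap.append((key, vertex))
        self._pos[vertex] = len(self._heap) - 1
        self._sift_up(len(self._heap) - 1)

    def extract_min(self):
        self._swap(0, len(self._heap) - 1)
        key, vertex = self._heap.pop()
        del self._pos[vertex]
        if self._heap:
            self._sift_down(0)
        return vertex, key

    def decrease_key(self, vertex, new_key):
        i = self._pos[vertex]          # O(1) lookup -- the whole point
        assert new_key <= self._heap[i][0]
        self._heap[i] = (new_key, vertex)
        self._sift_up(i)

    def _swap(self, i, j):
        self._heap[i], self._heap[j] = self._heap[j], self._heap[i]
        self._pos[self._heap[i][1]] = i  # keep the map in sync
        self._pos[self._heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0 and self._heap[i] < self._heap[(i - 1) // 2]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self._heap)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self._heap[c] < self._heap[smallest]:
                    smallest = c
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

pq = IndexedMinHeap()
for v, k in [("a", 7), ("b", 3), ("c", 9)]:
    pq.insert(v, k)
pq.decrease_key("c", 1)   # relax: c's tentative distance dropped to 1
print(pq.extract_min())   # ('c', 1)
```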

Finally, I would not recommend using the really fancy heap data structures (such as fibonacci heaps). While these often show up in textbooks as a way to get optimal asymptotics, in practice they have terrible constant factors and these constant factors are significant when compared with something that is logarithmic.

Other tips

Typically, you'd need a specially-constructed priority queue that supports the decreaseKey operation in order to get this to work. I've seen this implemented by having the priority queue explicitly keep track of a hash table of the indices (if using a binary heap), or by having an intrusive priority queue where elements stored are nodes in the heap (if using a binomial heap or Fibonacci heap, for example). Sometimes, the priority queue's insertion operation will return a pointer to the node in the priority queue that holds the newly-added key. As an example, here is an implementation of a Fibonacci heap that supports decreaseKey. It works by having each insert operation return a pointer to the node in the Fibonacci heap, which makes it possible to look up the node in O(1), assuming you keep track of the returned pointers.

Hope this helps!

Programming style-wise, the heap is a black box that abstracts away the details of a priority queue.

Not necessarily. Both C++ and Python have heap libraries that provide functions on arrays rather than black box objects. Go abstracts a bit, but requires the programmer to provide an array-like data structure for its heap operations to work on.
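
Python's standard heapq module is a concrete example of this: it exposes heap functions over a plain list rather than an opaque priority-queue object, so the backing array stays in your hands.

```python
import heapq

# heapq operates on an ordinary list -- the "heap" is just your array.
pq = []
heapq.heappush(pq, (5, "b"))
heapq.heappush(pq, (1, "a"))
heapq.heappush(pq, (3, "c"))

print(pq[0])              # peek: (1, 'a') -- smallest is always at index 0
print(heapq.heappop(pq))  # (1, 'a')

# Because it is just a list, in-place uses like heapsort work too:
data = [9, 4, 7, 1]
heapq.heapify(data)       # O(n), in place
print([heapq.heappop(data) for _ in range(len(data))])  # [1, 4, 7, 9]
```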

All this abstraction leaking in standardized, industry-strength libraries has a reason: some algorithms (Dijkstra) require a heap with additional operations, which would degrade the performance of other algorithms. Yet other algorithms (heapsort) need heap operations that work in-place on input arrays. If your library's heap gives you a black-box object, and it doesn't suffice for some algorithm, then it's time to re-implement the operations as functions on arrays, or find a library that does have the operations you need.

This is a great question and one of those details that algorithms books like CLRS just gloss over without mention.

There are a few ways to handle this, either:

  1. Use a custom heap implementation that supports decreaseKey operations
  2. Every time you "relax" a vertex, you just add it back into the heap with the new lower weight, then you write a custom way to ignore the old elements later. You can take advantage of the fact that you only ever add a node into the heap/priority-queue if the weight has decreased.

Option #1 is definitely used. For example, if you are familiar with OpenSourceRoutingMachine (OSRM), it searches over graphs with many millions of nodes to compute road routing directions. It uses a Boost implementation of a d-ary heap specifically because it has better decreaseKey operations (source). The Fibonacci heap is also often mentioned for this purpose because it supports O(1) decreaseKey operations, but likewise you'd probably have to roll your own.

In option #2 you end up doing more insertions and removeMin operations in total. If D is the total number of "relax" operations you must do, you end up doing a total of D additional heap operations. So while this has a theoretically worse runtime complexity, in practice there is research evidence that option #2 can be more performant because you can take advantage of cache locality and avoid the additional overhead of keeping pointers to do the decreaseKey operations (see [1], specifically pg. 16). This approach also has the advantage of being simpler and allows you to use standard library heap/priority-queue implementations in most languages.

To give you some pseudocode for how option #2 would look:

// Imagine this is some lookup table that has the minimum weight
// so far for each node.
weights = {}

while Queue is not empty:
    u = Queue.removeMin()

    // This is our new logic to discard the duplicate entries.
    if u.weight > weights[u]:
        continue

    visit neighbors[u] and relax() each one
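
That pseudocode can be fleshed out into runnable Python with the standard heapq module. The graph below is a made-up adjacency dict mapping each vertex to (neighbor, edge weight) pairs, just for illustration:

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra with lazy deletion (option #2): relaxed vertices are
    re-pushed, and stale heap entries are skipped on extraction."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:          # stale entry: u was relaxed after this push
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))  # duplicates are fine
    return dist

graph = {
    "s": [("a", 4), ("b", 1)],
    "a": [("c", 1)],
    "b": [("a", 2), ("c", 5)],
    "c": [],
}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 3, 'b': 1, 'c': 4}
```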

As an alternative, you can also check out the Python standard library heapq docs, which describe another approach to keeping track of "dead" entries in the heap. Whether you find it helpful depends on what data structure you are using for your graph representation and for storing vertex distances.


[1] Priority Queues and Dijkstra’s Algorithm 2007

License: CC-BY-SA with attribution
Not affiliated with StackOverflow