Question

I am implementing the Fast Marching algorithm, which is a kind of continuous Dijkstra. As many papers note, the Fibonacci heap is considered the most suitable heap for this purpose.

However, when profiling my code with callgrind, I see that the following function takes 58% of the execution time:

int popMinIdx () {
    const int idx = heap_.top()->getIndex();
    heap_.pop();
    return idx; 
}

Specifically, pop() alone takes 57.67% of the total execution time.

heap_ is defined as follows:

boost::heap::fibonacci_heap<const FMCell *, boost::heap::compare<compare_cells>> heap_;

Is it normal that it takes "that much" time or is there something I can do to improve performance?

Sorry if not enough information is given. I tried to be as brief as possible. I will add more info if needed.

Thank you!

Solution 2

The Fibonacci heap's pop() has an amortized runtime of O(log n) and a worst case of O(n). If your heap is large, it could easily consume the majority of the CPU time in your algorithm, especially since most of the other operations you are likely using (insert, top, etc.) run in O(1).

One thing I'd recommend is to run callgrind at your preferred optimization level (such as -O3) together with debug info (-g), because templated data structures/containers such as fibonacci_heap rely heavily on inlined functions. Many of the CPU cycles you are measuring may not even exist in your optimized executable.
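As a concrete workflow (file names, the binary name, and the build command are illustrative; adapt them to your build system), this looks like:

```shell
# Rebuild with optimization plus debug info, then re-profile.
g++ -O3 -g -o fmm main.cpp
valgrind --tool=callgrind ./fmm
# Inspect the annotated costs; most fibonacci_heap internals should
# now appear inlined into their callers rather than as separate frames.
callgrind_annotate callgrind.out.*
```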

OTHER TIPS

The other answers miss the big point: of course pop() takes the majority of your time, because it is the only operation that performs any real work!

As you may have read, the bounds on the operations of a Fibonacci heap are amortized bounds. This means that over a long enough sequence of operations, the average cost per operation matches those bounds. The actual cost of any individual call, however, is completely hidden.

Every time you insert an element, almost nothing happens: it is simply appended to the root list. Boom, O(1) time. Every time you merge two heaps, their root lists are simply concatenated. Boom, O(1) time. But hold on, your structure is not a valid Fibonacci heap! That's where pop() (or extract-min) comes in: every time this operation is called, the entire heap is restructured back into a correct shape. The minimum root is removed, its children are cut into the root list, and then trees in the root list are merged until no two trees with the same degree (number of children) remain.

So all of the work of Insert(e) and Merge(t) is actually delayed until Pop() is called, which then does all the work. What about the other operations?

Delete(e) is beautiful. We perform Decrease-Key(e, -inf) so that e becomes the minimum root, and then we perform Pop()! Again, the work is done by Pop().

Decrease-Key(e, v) does part of its work by itself: it cuts e into the root list, and this can trigger a cascade of further cuts that moves some of e's marked ancestors into the root list as well. So Decrease-Key can dump a whole lot of nodes into the root list. Can you guess which function has to clean that up?

TL;DR: Pop() is the workhorse of the Fibonacci heap. All other operations are efficient precisely because they defer their work to the Pop() operation. Pop() gathers that work and performs it in one go (which can take up to O(n)). This is actually very efficient overall, because the batched work can be done faster than each piece done separately.

So yes, it is natural that Pop() takes up the majority of your time!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow