Question

I currently have a big cluster (50+ computers) processing work packets. All these work-packages results are then collected by a controller computer.

So, suppose the data arrives in a semi random manner, and progressively growing:

0,2,3,8,7,1,4,9,10,11,6, ...

Then I have to dump to a file the data in order. And since the work-packages are 100M+, I dump them as soon as I have the next one ready in my buffer.

I am actually using a std::map for this "Random-Input-First-Output", example:

std::map<int, DataStruct> buffer;

long long next_work = 0;
while(next_work != 100000000){
    fill_buffer(buffer); //This method fills the buffer from the processing elements

    if(buffer.find(next_work) != buffer.end()){
        dump_to_file(buffer[next_work]); // Dumps it to file
        buffer.erase(next_work++);
    }
};

This buffer sometimes (due to network lag, PC lag, disconnects, etc...) it grows to 100000+ elements, which slows down the operation. So, I wonder myself if STL has a better class for this type of operations.

Is the use of std::map appropiate here?

Is there a better/CPU-efficient approach? (I don't really care if it is memory efficient)

Was it helpful?

Solution

Since there is no job less than next_work in your buffer, your map can only contain such an element as its very first element. You could instead use begin() to access your iterator, gaining the first element in constant time (and then test if it is the one you want). But what's still slowing down is the insertion, which will still be logarithmic in the current size of the buffer.

An alternative associative data-structure is a hash map. It supports expected constant-time insertion and look-up. It's not sorted by key, but since you only need to look-up an exact key this is not going to be a problem.

If your compiler supports C++11 (and you don't need to support pre-C++11 compilers), there is std::unordered_map. Probably just replacing the class will solve it all. Maybe you need to adjust other accesses to the data structure which you didn't show in the code snipped, i.e. we don't know how you fill your data structure in fill_buffer.

For C++03, you could try boost::unordered_map.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top