Efficieny of stl::map of stl::sets

https://stackoverflow.com/questions/18581957

27-06-2022
|

Domanda

I believe I'd like to use boost::icl::interval_map to solve a problem (described here, I'll post a complete answer if interval_maps ultimately work.)

I want to use an interval_map<unsigned long long, set<foo*>>, but the documentation for boost::icl mentions that there are potential efficiency problems (below from).

We are introducing interval_maps using an interval map of sets of strings, because of it's didactic advantages. The party example is used to give an immediate access to the basic ideas of interval maps and aggregate on overlap. For real world applications, an interval_map of sets is not necessarily recommended. It has the same efficiency problems as a std::map of std::sets. There is a big realm though of using interval_maps with numerical and other efficient data types for the associated values.

What are the efficiency issues with std::map of std::sets? and How can I avoid them?

Soluzione

Both std::map<K, V> and std::set<V> are node based containers linked by pointers. Traversing them has nice complexity guarantees (i.e., each individual operation is at most O(log n)) but you actually need fairly sizable containers for the complexity to matter compared, e.g., to a std::vector<std::pair<K, V>> especially when K and V are fundamental types. The main performance issue with node based containers is that they are laid out more or less randomly in memory while contemporary CPUs like to access data which is clustered in some form.

Of course, as usual you'll need to measure the times obtained between different implementations on fairly realistic data sets to determine which data structure yields the best performance.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a StackOverflow