Question

I know that, from an algorithm-analysis perspective, lookup in an unordered container has average-case constant time complexity, O(1), whereas in an ordered container it is O(log n). In practice, however, the qualitative and quantitative criteria that would lead us to choose unordered containers over ordered ones depend on quite a few factors.

What factors make unordered containers preferable to their ordered counterparts?

Solution

Assume that you have both a comparator and a hash function, and therefore you have a free choice between the two. Assume also that you can avoid any single-element operations that are linear in one container but not the other (John Zwinck in a comment makes the excellent point that unordered erase is documented worst-case linear in the size of the container but in practice has been implemented slower even than that).

Then, the main criterion is whether you need to iterate the container in sorted order. If so, then you would expect to use the ordered container. If not, then you would expect to use the unordered container.
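To illustrate what "needing sorted order" buys you, here is a minimal sketch. The helper names (keys_in_iteration_order, keys_in_range) are made up for this example. A std::map yields its keys in ascending order and supports range queries through lower_bound; an unordered_map yields the same keys in an unspecified order.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: collect the keys in whatever order the container
// yields them. Sorted for std::map, unspecified for std::unordered_map.
template <typename Container>
std::vector<int> keys_in_iteration_order(const Container& c) {
    std::vector<int> keys;
    keys.reserve(c.size());
    for (const auto& kv : c) keys.push_back(kv.first);
    return keys;
}

// Hypothetical helper: a range query that relies on ordering.
// Returns all keys in the half-open interval [lo, hi).
inline std::vector<int> keys_in_range(const std::map<int, std::string>& m,
                                      int lo, int hi) {
    assert(lo <= hi);
    std::vector<int> keys;
    for (auto it = m.lower_bound(lo); it != m.end() && it->first < hi; ++it)
        keys.push_back(it->first);
    return keys;
}
```

If your code needs something like keys_in_range often, that is a strong hint towards the ordered container.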

As a secondary possibility, the interfaces of the two are sufficiently similar that you might easily test performance of your actual application with both, and choose the faster. It's not a difficult coding task to iterate an unordered container in sorted order, provided that it won't be modified while you're doing it.
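One way to iterate an unordered container in sorted order is to collect pointers to its elements and sort those. This is only a sketch; sorted_view is a name invented here, and it assumes the key type has operator<.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns pointers to the elements of `m`, sorted by key.
// The pointers dangle if the elements they point to are erased, so the
// simplest rule is: do not modify `m` while the view is in use.
template <typename K, typename V>
std::vector<const std::pair<const K, V>*> sorted_view(
    const std::unordered_map<K, V>& m) {
    std::vector<const std::pair<const K, V>*> view;
    view.reserve(m.size());
    for (const auto& kv : m) view.push_back(&kv);
    std::sort(view.begin(), view.end(),
              [](const auto* a, const auto* b) { return a->first < b->first; });
    assert(view.size() == m.size());
    return view;
}
```

Building the view costs O(n log n) each time, so this suits occasional sorted traversal, not a hot loop.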

Of course there are pitfalls associated with profiling -- your tests might fail to use realistic data. And even if they're OK, your users might fail to use realistic data ;-) For most cases there's nothing better. For specific situations like real-time guarantees, you need to know a lot more about the implementation you're using than the standard tells you (and for that reason you might not use standard containers at all).

The reason I say that performance testing is secondary in this case is just that you could, if you had unlimited development time, profile every possible different way to write your program and pick the best. Since you don't have unlimited development time you primarily choose how to write your code using heuristics. Having chosen something that works and mostly isn't awful, you profile to identify hot code and then choose between plausible-seeming options for that code.
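Because the two interfaces are so similar, a rough timing harness can be templated on the container type. The following is a sketch only; run_workload is a made-up name, and the insert-then-lookup workload is an assumption that you should replace with something resembling your real access pattern.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <unordered_map>
#include <vector>

// Hypothetical harness: inserts every key, then looks each one up.
// Returns the number of successful lookups (so the work cannot be
// optimized away) and writes the elapsed wall-clock time to `elapsed`.
template <typename Map>
std::size_t run_workload(const std::vector<int>& keys,
                         std::chrono::nanoseconds& elapsed) {
    const auto start = std::chrono::steady_clock::now();
    Map m;
    for (int k : keys) m.emplace(k, k);
    std::size_t hits = 0;
    for (int k : keys) hits += m.count(k);
    elapsed = std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
    assert(hits == keys.size());  // every inserted key must be found
    return hits;
}
```

Call it as run_workload<std::map<int, int>>(keys, t) and run_workload<std::unordered_map<int, int>>(keys, t) with the same keys, repeat several times, and compare. A single run on synthetic keys tells you very little.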

OTHER TIPS

Herb Sutter covered this in his talk Modern C++: What You Need to Know https://www.youtube.com/watch?v=TJHgp1ugKGM

In summary: use std::vector, test, and profile. std::vector usually wins, sometimes by a few orders of magnitude.
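One way to read this advice for lookup-heavy code is a vector kept sorted by key and searched with binary search. This is my own sketch, not something taken from the talk; Entry, sort_by_key and find_value are names invented for the example, and it assumes the data is loaded in bulk and rarely modified afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Entry = std::pair<int, int>;  // (key, value), hypothetical layout

// Sort once by key after bulk loading.
inline void sort_by_key(std::vector<Entry>& v) {
    std::sort(v.begin(), v.end(),
              [](const Entry& a, const Entry& b) { return a.first < b.first; });
}

// Binary search over contiguous storage. Returns nullptr if `key` is absent.
// Precondition: `v` is sorted by key (checked in debug builds only; the
// check itself is linear, so it disappears with NDEBUG).
inline const int* find_value(const std::vector<Entry>& v, int key) {
    assert(std::is_sorted(
        v.begin(), v.end(),
        [](const Entry& a, const Entry& b) { return a.first < b.first; }));
    auto it = std::lower_bound(
        v.begin(), v.end(), key,
        [](const Entry& e, int k) { return e.first < k; });
    return (it != v.end() && it->first == key) ? &it->second : nullptr;
}
```

Lookup is still O(log n), as in std::map, but the elements sit in contiguous memory. Whether that actually wins for your data is exactly what profiling should tell you.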

If you don't need the sequence to be ordered, use unordered containers: they are normally a lot faster. Of course, it depends somewhat on how the cost of the hash function and equality operator compares to that of the relational operator, but assuming these are reasonable, you'll get much faster operations with unordered containers. As usual, you should profile your specific use case(s).
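To make the "hash and equality versus relational operator" point concrete, here is a sketch of a key type that supplies both, so either container can be used. Employee, EmployeeHash and value_of are hypothetical names, and the hash-combining formula is just one common choice, not a recommendation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <tuple>
#include <unordered_map>

struct Employee {  // hypothetical key type
    std::string name;
    int id;
};

// Needed by std::unordered_map (together with the hash below).
inline bool operator==(const Employee& a, const Employee& b) {
    return a.id == b.id && a.name == b.name;
}

// Needed by std::map: a strict weak ordering.
inline bool operator<(const Employee& a, const Employee& b) {
    return std::tie(a.id, a.name) < std::tie(b.id, b.name);
}

// Hash functor. It always hashes the whole string, whereas operator< can
// stop at the first differing member; this is where the costs can diverge.
struct EmployeeHash {
    std::size_t operator()(const Employee& e) const {
        const std::size_t h1 = std::hash<int>{}(e.id);
        const std::size_t h2 = std::hash<std::string>{}(e.name);
        return h1 ^ (h2 + 0x9e3779b9 + (h1 << 6) + (h1 >> 2));
    }
};

using OrderedIndex = std::map<Employee, int>;
using UnorderedIndex = std::unordered_map<Employee, int, EmployeeHash>;

// The same lookup code works for both index types.
template <typename Index>
int value_of(const Index& index, const Employee& e) {
    const auto it = index.find(e);
    assert(it != index.end() && "key must be present");
    return it->second;
}
```

With both aliases in place, switching containers for a profiling run is a one-line change.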

Licensed under: CC-BY-SA with attribution