Question

I have std::unordered_map<int, int>. I do not want to use other structures like tree or anything else cause latency requirements. But at any time I need to know current max key and min key. How can I do that? Distribution is NOT uniform, instead max and min are frequently removed and inserted. So I need something smarter than "just scan entire map for a new max/min when current max/min is removed".

I do not want to use any other structures. I want to use std::unordered_map!

upd according to the answer created such structure:

struct OrderBookItem {
    int64_t price;
    int32_t lots;
};

typedef multi_index_container
    <OrderBookItem, indexed_by<

    hashed_unique<
    BOOST_MULTI_INDEX_MEMBER(OrderBookItem,int64_t,price)
    >,

    ordered_unique<
    BOOST_MULTI_INDEX_MEMBER(OrderBookItem,int64_t,price),
    std::greater<int64_t>
    >

    >> OrderBookContainer;
Was it helpful?

Solution

Your exact requirements cannot be met: std::unordered_map is fast (i.e. O(1) insertion/erasure/lookup) because it doesn't order its elements. This means that finding the minimum/maximum takes O(N).

The price of ordering paid by std::map that insertion/erasure/lookup (including finding the minimum/maximum) all become O(log N).

If you don't mind using Boost, you could take a look at the Boost.MultiIndex library. This allows you to access your data by multiple interfaces simultaneously. It is an extremely versatile and high-performance library that can also be used for MRU cache data structures (which also combine fast access and tracking of an ordering).

Your primary std::unordered_map interface is provided by the hashed_unique index, where you use an identity function object (provided by Boost.MultiIndex). Your secondary interface would mimic a std::set. This allows finding the minimum/maximum value in O(1) time as *begin() and *rbegin().

You can do that by adding a ordered_unique index with a user-defined MyOrder function object to compute whatever criterion you like to use to order them. Small code example (omitting the boost:: namespace everywhere)

using MyContainer = multi_index_container<
    Item,
    indexed_by<
        hashed_unique<identity<Item>>, // O(1) erasure/lookup, O(log N) insertion
        ordered_unique<MyOrder<Item>>  // track minimum/maximum in O(log N)
    >
>;

Behind the scenes, Boost.MultiIndex implements this roughly as a std::set<Value> with a std::unordered_map<Key, set<Value>::iterator>. This means both lookups and erasures are O(1). The erasure can be O(1) because the unordered_map::erase(value) returns an iterator into the set and std::set<Value>::erase(iterator) is O(1).

However, insertions remain O(log N), because you cannot avoid finding the rank of a newly inserted value within an ordered sequence in less time.

This improvement compared to std::map for lookup/erasure costs only a few pointers per element space overhead.

OTHER TIPS

You can write a wrapper. With that, you get every insertion/deletion and can retain min / max values.

#pragma once

#include <unordered_map>

using std::unordered_map;

class my_unordered_map
{
public:
    void insert(int key, int value)
    {
        _map[key] = value;
        if (_max < value)
            _max = value;
        if (_min> value)
            _min = value;
    }
    int get(int key)
    {
        auto it = _map.find(key);
        if (it != _map.end())
            return it->second;
        return 0; // or some error handling here.
    }
    int getMin()
    {
        return _min;
    }
    int getMax()
    {
        return _max;
    }
    // more functionality here.
private:
    unordered_map<int, int> _map;
    int _min = 0;
    int _max = 0;
};
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top