The resizing issue is not relevant, as you already gave an estimate of the number of elements in your question. So you can give a ConcurrentHashMap an initial capacity large enough to avoid any rehashing.
The performance will not depend on the number of elements (that is the main goal of hashing) but on the number of concurrent threads.
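As a small sketch of the pre-sizing advice: ConcurrentHashMap has a three-argument constructor taking an initial capacity, a load factor, and a concurrency-level hint, so you can size it once up front. The helper name below is illustrative, not part of any API:

```java
import java.util.concurrent.ConcurrentHashMap;

class PreSizedMap {
    // Illustrative helper: pre-size the map for the expected element count so
    // it never needs to rehash, and hint the expected number of writer threads.
    static ConcurrentHashMap<Long, String> create(int expectedElements, int threads) {
        return new ConcurrentHashMap<>(expectedElements, 0.75f, threads);
    }
}
```

With ~1,000,000 expected elements you would call something like `PreSizedMap.create(1_000_000, 16)` once and reuse the map.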
The main problem is that you don't have a plan for how to handle failed locks. Unless you want to poll until locking succeeds (which is not recommended), you need a way of putting a thread to sleep, which implies that the thread currently owning the lock has to wake up a sleeping thread on release, if one exists. So you end up requiring conventional Lock features that a ConcurrentHashMap does not offer.
Creating a Lock per element (as you said, ~1,000,000) would not be a solution.
A solution would look a bit like the way a ConcurrentHashMap works internally. Given a certain concurrency level, i.e. the number of threads you might have (rounded up), you create that number of Locks (a far smaller number than 1,000,000).
Now you assign each element one of the Locks. A simple assignment would be based on the element's hashCode, assuming it is stable. Then locking an element means acquiring its assigned Lock, which gives you up to the configured concurrency level, as long as all currently locked elements are mapped to different Locks.
This might imply that threads locking different elements block each other if the elements are mapped to the same Lock, but only with a predictable likelihood. You can try fine-tuning the concurrency level (as said, use a number higher than the number of threads) to find the best trade-off.
A big advantage of this approach is that you do not need to maintain a data structure whose size depends on the number of elements. AFAIK, the new parallel ClassLoader uses a similar technique.