As your numbers show, the cache does not strictly guarantee a maximum, but it does attempt to maintain that high watermark. If it made a strong guarantee, it would limit concurrency by requiring a blocking call for every write operation and could not efficiently maintain a top-level LRU policy. In Guava's Cache we do make that restriction by using lock striping, but that trade-off comes with its own limitations.
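For contrast, here is a minimal sketch of a size-bounded Guava cache; the capacity and key/value types are arbitrary placeholders, not values from your setup:

```java
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

public class GuavaBoundExample {
    public static void main(String[] args) {
        // Guava enforces the bound under its segment locks (lock striping),
        // so evictions happen as part of the write rather than asynchronously.
        Cache<String, String> cache = CacheBuilder.newBuilder()
                .maximumSize(100)  // arbitrary capacity for illustration
                .build();

        for (int i = 0; i < 1_000; i++) {
            cache.put("key-" + i, "value-" + i);
        }
        // size() should stay at or near the configured maximum.
        System.out.println("size = " + cache.size());
    }
}
```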
Because CLHM asynchronously connects the hash table and the LRU data structures, the two numbers can deviate.
The size is measured by the decorated ConcurrentHashMap. This is the most accurate estimate of the number of key-value pairs at any given instant. The weighted size is calculated by the LRU policy when it replays the map operations against its data structures to determine the recency ordering and perform evictions. The weighted size stays at the expected maximum because the policy evicts once that threshold has been exceeded.
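A small sketch to observe the two counters, assuming the CLHM v1.x Builder API; the capacity and loop count are arbitrary:

```java
import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;

public class SizeVsWeightedSize {
    public static void main(String[] args) {
        ConcurrentLinkedHashMap<Integer, Integer> map =
                new ConcurrentLinkedHashMap.Builder<Integer, Integer>()
                        .maximumWeightedCapacity(100)  // arbitrary bound for illustration
                        .build();

        for (int i = 0; i < 10_000; i++) {
            map.put(i, i);
        }

        // size() comes from the decorated ConcurrentHashMap; weightedSize() is
        // maintained by the LRU policy as it replays the buffered operations,
        // so the two can briefly disagree under heavy write bursts.
        System.out.println("size = " + map.size());
        System.out.println("weightedSize = " + map.weightedSize());
    }
}
```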
It is expected that the size differs in bursts, but the cache always tries to self-correct quickly. In your example the behavior is exacerbated because the operating system schedules a thread for a large time slice, allowing it to perform many more insertions than would be possible in a real-world scenario. The amortized LRU management means that eviction cannot keep up, since it only steals a bit of time from one thread at any given moment. To better simulate actual behavior, the MemoryLeakTest forces a context switch between operations.
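A sketch of that idea, assuming Thread.yield() is an adequate stand-in for a natural context switch (actual scheduling behavior varies by OS):

```java
import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;

public class YieldingStress {
    public static void main(String[] args) {
        ConcurrentLinkedHashMap<Integer, Integer> map =
                new ConcurrentLinkedHashMap.Builder<Integer, Integer>()
                        .maximumWeightedCapacity(100)  // arbitrary bound
                        .build();

        for (int i = 0; i < 1_000_000; i++) {
            map.put(i, i);
            // Yielding between writes mimics the context switches of a real
            // workload, giving the amortized LRU maintenance time to evict.
            Thread.yield();
        }
        System.out.println("size = " + map.size());  // should hover near the bound
    }
}
```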
If you implement an eviction listener, you should see the count increase and confirm that you have reached a steady state. If you force a context switch, you should see the size kept at a lower value. If you want a stricter limit on the total possible number of entries ever in the map, then prefer the Guava implementation.
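For the listener, a sketch using CLHM's EvictionListener; the AtomicLong counter is just one way to observe that evictions keep pace with insertions:

```java
import java.util.concurrent.atomic.AtomicLong;
import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;
import com.googlecode.concurrentlinkedhashmap.EvictionListener;

public class EvictionCountExample {
    public static void main(String[] args) {
        final AtomicLong evictions = new AtomicLong();

        // Invoked as the policy evicts; a steadily increasing count while
        // size() plateaus indicates the cache has self-corrected.
        EvictionListener<Integer, Integer> listener =
                new EvictionListener<Integer, Integer>() {
                    @Override
                    public void onEviction(Integer key, Integer value) {
                        evictions.incrementAndGet();
                    }
                };

        ConcurrentLinkedHashMap<Integer, Integer> map =
                new ConcurrentLinkedHashMap.Builder<Integer, Integer>()
                        .maximumWeightedCapacity(100)  // arbitrary bound
                        .listener(listener)
                        .build();

        for (int i = 0; i < 10_000; i++) {
            map.put(i, i);
        }
        System.out.println("evictions = " + evictions.get());
        System.out.println("size = " + map.size());
    }
}
```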