Question

I'm using a Dictionary<> to store a bazillion items. Is it safe to assume that as long as the server's memory has enough space to accommodate these bazillion items that I'll get near O(1) retrieval of items from it? What should I know about using a generic Dictionary as huge cache when performance is important?

EDIT: I shouldn't rely on the default implementations? What makes for a good hashing function?

Was it helpful?

Solution

It depends, just about entirely, on how good a hashing functionality your "bazillion items" support -- if their hashing function is not excellent (so that many conflicts result) your performance will degrade with the growth of the dictionary.

OTHER TIPS

You should measure it and find out. You're the one who has knowledge of the exact usage of your dictionary, so you're the one who can measure it to see if it meets your needs.

A word of advice: I have in the past done performance analysis on large dictionary structures, and discovered that performance did degrade as the dictionary became extremely large. But it seemed to degrade here and there, not consistently on each operation. I did a lot of work trying to analyze the hash algorithms, etc, before smacking myself in the forehead. The garbage collector was getting slower because I had so much live working set; the dictionary was just as fast as it always was, but if a collection happened to be triggered, then that was eating up my cycles.

That's why it is important to not do performance testing in unrealistic benchmark scenarios; to find out what the real-world performance cost of your bazillion-item dictionary is, well, that's going to be gated on lots of stuff that has nothing to do with your dictionary, like how much collection triggering is happening throughout the rest of your program, and when.

Yes you will have O(1) access times. In fact to be pedantic g it will be exactly O(1). You need to ensure that all your objects that are used as keys have a good GetHashCode implementation and should likely override Equals.

Edit to clarify: In reality acess times will get slower the more items you have unless you can provide a "perfect" hash function.

Yes, you will have near O(1) no matter how many objects you put into the Dictionary. But for the Dictionary to be fast, your key-objects should provide a sufficient GetHashCode-implementation, because Dictionary uses a hashtable inside.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top