Question

If I have a key set of 1000 entries, what is a suitable size for my hash table, and how is that determined?


Solution

It depends on the load factor (the "percent full" threshold at which the table grows and redistributes its elements). If you know you have exactly 1000 entries, and that number will never change, you can set the load factor to 1.0 and the initial size to 1000 for maximum space efficiency: the table will never need to resize. If you aren't sure of the exact size, you can leave the load factor at its default of 0.75 and set the initial size to 1334 (expected size / load factor, rounded up) for very good performance, at the cost of some extra memory.

You can use the following constructor to set the initial capacity and load factor:

Hashtable(int initialCapacity, float loadFactor) 
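
For illustration, here is a minimal sketch of both sizing strategies (assuming java.util.Hashtable; HashMap offers the same constructor):

import java.util.Hashtable;

public class SizingDemo {
    public static void main(String[] args) {
        // Exactly 1000 entries, count never changes: load factor 1.0 and
        // capacity 1000, so the table never needs to resize.
        Hashtable<String, Integer> exact = new Hashtable<>(1000, 1.0f);

        // Size not known precisely: default load factor 0.75 and capacity
        // 1334 (1000 / 0.75, rounded up), so 1000 entries never trigger a rehash.
        Hashtable<String, Integer> roomy = new Hashtable<>(1334, 0.75f);

        for (int i = 0; i < 1000; i++) {
            exact.put("key" + i, i);
            roomy.put("key" + i, i);
        }
    }
}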

OTHER TIPS

You need to factor in the hash function as well.

One rule of thumb suggests making the table size about double the number of keys, so that there is room to expand and, hopefully, the number of collisions stays small.

Another rule of thumb is to assume you are doing some sort of modulo-based hashing: round your table size up to the next largest prime number, and use that prime as the modulo value.
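
As a rough sketch of those two rules combined (the nextPrime and isPrime helpers below are hypothetical, written just for this example):

public class TableSizing {
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; (long) i * i <= n; i++) {
            if (n % i == 0) return false;
        }
        return true;
    }

    // Round a candidate capacity up to the next prime.
    static int nextPrime(int n) {
        while (!isPrime(n)) n++;
        return n;
    }

    public static void main(String[] args) {
        int keys = 1000;
        int capacity = nextPrime(2 * keys); // double to 2000, round up to 2003
        // A modulo-based bucket index for an arbitrary key:
        int bucket = Math.floorMod("someKey".hashCode(), capacity);
        System.out.println("capacity " + capacity + ", bucket " + bucket);
    }
}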

What kind of things are you hashing? More detail should generate better advice.

There's some discussion of these factors in the documentation for Hashtable.

Let it grow. At this size, the automatic handling is fine. Beyond that, 2 x size + 1 is a simple formula; it's what java.util.Hashtable itself uses when it rehashes. Prime numbers are also somewhat helpful, but once your data set reaches a certain size, the hash implementation may decide to rehash and grow the table anyway.
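
As a quick illustration of that growth formula, assuming a table that starts at java.util.Hashtable's default initial capacity of 11:

public class GrowthDemo {
    public static void main(String[] args) {
        int capacity = 11; // java.util.Hashtable's default initial capacity
        for (int i = 0; i < 6; i++) {
            System.out.println(capacity);
            capacity = capacity * 2 + 1; // the 2 x size + 1 growth rule
        }
        // Prints: 11, 23, 47, 95, 191, 383
    }
}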

Your keys determine how effective the table is; hopefully their hash codes are distinct enough.

Bottom line: ask the sizing question when you actually run into problems, such as excessive memory use or slow performance. Until then, don't worry!

Twice is good.

You don't have a big key set. Don't get bogged down in intricate discussions about your Hashtable implementation; just go for 2000.

I'd like to reiterate what https://stackoverflow.com/users/33229/wwwflickrcomphotosrene-germany said above. 1000 doesn't seem like a very big hash table to me. I've used a lot of hashtables around that size in Java without seeing much in the way of performance problems, and I hardly ever muck about with the size or load factor.

If you've run a profiler on your code and determined that the hashtable is your problem, then by all means start tweaking. Otherwise, I wouldn't assume you've got a problem until you're sure.

After all, in most code, the performance problem isn't where you think it is. I try not to anticipate.
