C++ hashing: Open addressing and Chaining

https://stackoverflow.com/questions/10842866

12-06-2021
|

Question

For Chaining:

Can someone please explain this concept to me and provide me a theory example and a simple code one?

I get the idea of "Each table location points to a linked list (chain) of items that hash to this location", but I can't seem to illustrate what's actually going on.

Suppose we had h(x) (hashing function) = x/10 mod 5. Now to hash 12540, 51288, 90100, 41233, 54991, 45329, 14236, how would that look like?

And for open addressing (linear probing, quadratic probing, and probing for every R location), can someone explain that to me as well? I tried Googling around but I seem to get confused further.

Solution

Chaining is probably the most obvious form of hashing. The hash-table is actually an array of linked-lists that are initially empty. Items are inserted by adding a new node to the linked-list at the item's calculated table index. If a collision occurs then a new node is linked to the previous tail node of the linked-list. (Actually, an implementation may sort the items in the list but let's keep it simple). One advantage of this mode is that the hash-table can never become 'full', a disadvantage is that you jump around memory a lot and your CPU cache will hate you.

Open Addressing tries to take advantage of the fact that the hash-table is likely to be sparsely populated (large gaps between entries). The hash-table is an array of items. If a collision occurs, instead of adding the item to the end of the current item at that location, the algorithm searches for the next empty space in the hash-table. However this means that you cannot rely on the hashcode alone to see if an item is present, you must also compare the contents if the hashcode matches. The 'probing' is the strategy the algorithm follows when trying to find the next free slot. One issue is that the table can become full, i.e. no more empty slots. In this case the table will need to be resized and the hash function changed to take into account the new size. All existing items in the table must be reinserted too as their hash codes will no longer have the same values once the hash function is changed. This may take a while.

Here's a Java animation of a hash table.

OTHER TIPS

because you do mod 5, your table will have 5 locations

location 0: 90100

because the result of 90100/10 mod 5 is 0

for same reason, you have:

location 1: None

location 2: 45329

location 3: 51288->41233->14236

location 4: 12540->54991

you can check out more info on wikipedia

In open addressing we have to store element in table using any of the technique (load factor less than equal to one).

But in case of chaining the hash table only stores the head pointers of Linklist ,Therefore load factor can be greater than one.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow