Time complexity to fill hash table?

https://stackoverflow.com/questions/11277694

18-06-2021
|

Question

This is a homework question, but I think there's something missing from it. It asks:

Provide a sequence of m keys to fill a hash table implemented with linear probing, such that the time to fill it is minimum.

And then

Provide another sequence of m keys, but such that the time fill it is maximum. Repeat these two questions if the hash table implements quadratic probing

I can only assume that the hash table has size m, both because it's the only number given and because we have been using that letter to address a hash table size before when describing the load factor. But I can't think of any sequence to do the first without knowing the hash function that hashes the sequence into the table.

If it is a bad hash function, such that, for instance, it hashes every entry to the same index, then both the minimum and maximum time to fill it will take O(n) time, regardless of what the sequence looks like. And in the average case, where I assume the hash function is OK, how am I supposed to know how long it will take for that hash function to fill the table?

Aren't these questions linked to the hash function stronger than they are to the sequence that is hashed?

As for the second question, I can assume that, regardless of the hash function, a sequence of size m with the same key repeated m-times will provide the maximum time, because it will cause linear probing from the second entry on. I think that will take O(n) time. Is that correct?

Solution

Well, the idea behind these questions is to test your understanding of probing styles. For linear probing, if a collision occurs, you simply test the next cell. And it goes on like this until you find an available cell to store your data. Your hash table doesn't need to be size m but it needs to be at least size m.

First question is asking that if you have a perfect hash function, what is the complexity of populating the table. Perfect hashing function addresses each element without collision. So for each element in m, you need O(1) time. Total complexity is O(m).

Second question is asking for the case that hash(X)=cell(0), which all of the elements will search till the first empty cell(just rear of the currently populated table).

For the first element, you probe once -> O(1)

For the second element, you probe twice -> O(2)

for the nth element, you probe n times -> O(n)

overall you have m elements, so -> O(n*(n+1)/2)

For quadratic probing, you have the same strategy. The minimum case is the same, but the maximum case will have O(nlogn). ( I didn't solve it, just it's my educated guess.)

OTHER TIPS

This questions doesn't sound terribly concerned with the hash function, but it would be nice to have. You seem to pretty much get it, though. It sounds to me like the question is more concerned with "do you know what a worst-case list of keys would be?" than "do you know how to exploit bad hash functions?"

Obviously, if you come up with a sequence where all the entries hash to different locations, then you have O(1) insertions for O(m) time in total.

For what you are saying about hashing all the keys to the same location, each insertion should take O(n) if that's what you are suggesting. However, that's not the total time for inserting all the elements. Also, you might want to consider not literally using the same key over and over but rather using keys that would produce the same location in the table. I think, by convention, inserting the same key should cause a replacement, though I'm not 100% sure.

I'll apologize in advance if I gave too much information or left anything unclear. This question seems pretty cut-and-dried save the part about not actually knowing the hash function, and it was kind of hard to really say much without answering the whole question.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow