Hash function, $h(k) = \lfloor km \rfloor$ is simple uniform for real $k$ independently, uniformly distributed in the range $0 \leq k < 1$

cs.stackexchange https://cs.stackexchange.com/questions/127682

Question

I was going through the text Introduction to Algorithms by Cormen et. al. where I came across the following statement:

If the keys are known to be random real numbers $k$ independently and uniformly distributed in the range $0 \leq k < 1$ , the hash function

$$h(k) = \lfloor km \rfloor$$ satisfies the condition of simple uniform hashing.

Now what I can understand that they are probably considering uniform disturbution in the "continuous" sense and not in the discrete sense. Had it been in the discrete sense then suppose for $n$ keys the probabity mass function (p.m.f) shall be constant and equal to $1/n$ and so it shall be equally likely for each key to be used in the hashing there-by yeilding the desired result.

But we seem sort of in trouble if the distribution being referred to is continuous (I feel so because of the line: "uniformly distributed in the range $0 \leq k < 1$")

Let $f(x)$ be the associated probability density function (p.d.f) and from the given information we have $f(x)=1$,(which is quite easily found, integrating $f(x)$ in the range $0$ to $1$ and equating it with $1$ and noting that in uniform distribution the p.d.f is a constant).

Now though the p.d.f is a constant but p.d.f is not the probability. Rather probability at a spectrum point is $0$. Now how to use this result to get to the claim of the authors.

Or am I entirely at fault considering the distribution to be continuous?

(There is an answer here, but it does not go into this detail as the question there is different after all).

Was it helpful?

Solution

$h\in [m]^U$ satisfies the simple uniform hashing assumption if when $x\in U$ is chosen uniformly at random, then $h(x)$ is uniformly distributed over $[m]$, or equivalently $\forall i\in[m]: \Pr\limits_{x\in U}[h(x)=i]=\frac{1}{m}$. In our case we have:

$\Pr[h(x)=i]=\Pr\big[\lfloor mx \rfloor=i\big]=\Pr[i\le mx < i+1]=\frac{i+1}{m}-\frac{i}{m}=\frac{1}{m}$.

We used the fact that if $x$ is uniformly distributed over $[0,1]$ then $\Pr[a\le x\le b]=b-a$ (the equality holds with all four combinations of $\le$ and $<$).

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top