Question

I really enjoy reading up on algorithms and data structures and in the course of doing so often come across ones that rely on $k$-wise independent families of hash functions. I'm perfectly comfortable working with these families of hash functions as black boxes (that is, I understand the relevant definitions and how the algorithms and data structures rely on those definitions), but when I look at specific examples of families of $k$-wise independent families of hash functions, I often have a really hard time understanding why they work. I assume that a lot of this is due to the fact that I don't have a super solid grasp of number theory or finite fields.

One trend I've noticed is that to form a $k$-wise independent family of hash functions whose domain and codomain is a set $\{0, 1, 2, 3, \dots, p-1 \}$, where $p$ is a prime, you can choose a random $(k+1)$-degree polynomial by choosing $k+1$ random elements out of $\{0, 1, 2, \dots, p-1\}$, then to have the hash function be "evaluate the polynomial on the given input value." (I assume that this is equivalent to choosing a random $(k+1)$-degree polynomial over $\mathbb{F}_p$, though I may be mistaken about this.)

My intuition behind why this approach works is that, just like a degree-$(k+1)$ polynomial over the reals cannot be determined by $k$ or fewer points, a degree-$(k+1)$ polynomial over $\mathbb{F}_p$ cannot be determined by $k$ or fewer points. As a result, these polynomials work well as hash functions because, with random coefficients, if you see $k-1$ outputs of the hash function, it's impossible to predict any $k$th output of the hash function better than a random guess because you can't determine what the polynomial is.

My questions are the following:

  1. Is this a correct intuition for why these families of hash functions work correctly?
  2. I teach undergraduate courses on algorithms and data structures that do not have number theory or abstract algebra as prerequisites. If this intuition is correct, would it be inappropriate / misleading to use this intuition to explain these families of hash functions?

Thanks!

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top