Question

Does anybody have an implementation of Cuckoo hashing in C? If there were an open-source, non-GPL version, it would be perfect!

Since Adam mentioned it in his comment: does anyone know why it is not widely used? Is it just a matter of implementation, or do the good theoretical properties fail to materialize in practice?



As other answers have pointed out, it's true that the simplest cuckoo hashtable requires that the table be half empty. However, the concept has been generalized to d-ary cuckoo hashing, in which each key has d possible places to nest, as opposed to 2 places in the simple version.

The acceptable load factor increases quickly as d is increased. With d=3 you can already run the table at around 75% full. The downside is that you need d independent hash functions. I'm a fan of Bob Jenkins' hash functions for this purpose (see http://burtleburtle.net/bob/c/lookup3.c), and you might find them useful in a cuckoo hashing implementation.
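For illustration, here is a minimal sketch of such a d-ary table in C. It is not a drop-in implementation: the names (cuckoo_lookup, cuckoo_insert, hash_d) are made up for this example, and a single 64-bit mixer reused with different seeds stands in for d truly independent hash functions (lookup3's hashlittle() called with different initval seeds would serve the same purpose).

#include <stdint.h>
#include <stddef.h>

#define D      3         /* number of hash functions (d-ary)           */
#define NSLOTS 1024      /* table size; a power of two so we can mask  */

struct centry {
  uint64_t key;          /* 0 is reserved as the "empty" marker here   */
  void    *value;
};

static struct centry table[NSLOTS];

/* One 64-bit mixer (splitmix64-style) reused with D different seeds
   stands in for D independent hash functions.                         */
static size_t hash_d(uint64_t key, int i)
{
  static const uint64_t seeds[D] = {
    0x9E3779B97F4A7C15ULL, 0xBF58476D1CE4E5B9ULL, 0x94D049BB133111EBULL
  };
  uint64_t x = key + seeds[i];
  x = (x ^ (x >> 30)) * 0xBF58476D1CE4E5B9ULL;
  x = (x ^ (x >> 27)) * 0x94D049BB133111EBULL;
  x ^= x >> 31;
  return (size_t)(x & (NSLOTS - 1));
}

/* Lookup probes at most D slots, hence the constant worst-case time. */
void *cuckoo_lookup(uint64_t key)
{
  for (int i = 0; i < D; i++) {
    struct centry *s = &table[hash_d(key, i)];
    if (s->key == key)
      return s->value;
  }
  return NULL;
}

/* Insert: if all D nests are taken, kick out an occupant and re-insert
   it, up to a bounded number of displacements.  A real implementation
   would rehash or grow the table when the loop gives up.               */
int cuckoo_insert(uint64_t key, void *value)
{
  for (int kicks = 0; kicks < 64; kicks++) {
    for (int i = 0; i < D; i++) {
      struct centry *s = &table[hash_d(key, i)];
      if (s->key == 0 || s->key == key) {
        s->key = key;
        s->value = value;
        return 1;
      }
    }
    /* all D nests full: displace the occupant of one of them */
    struct centry *victim = &table[hash_d(key, kicks % D)];
    uint64_t vkey = victim->key;
    void *vval = victim->value;
    victim->key = key;
    victim->value = value;
    key = vkey;
    value = vval;
  }
  return 0;   /* give up: the caller should rehash or grow */
}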

Cuckoo hashing is relatively unused outside of academia (aside from hardware caches, which sometimes borrow ideas from it but don't really implement it fully). It requires a very sparse hash table to get good insertion times: you really need to have 51% of your table empty for good performance. So it is either fast and takes a lot of space, or slow and space-efficient, but never both. Other algorithms are both time- and space-efficient, although they are worse than cuckoo hashing when only time or only space is taken into account.

Here is a code generator for cuckoo hash tables. Check the license of the generator to verify that the output is non-GPL. It should be, but check anyway.

-Adam

Even though it's an old question, someone might still be interested :)

This paper describes the implementation of a parallel d-ary cuckoo hash on GPUs (CUDA/OpenCL). It's described very well and implementing it based on the description is quite easy. Generally worth reading, if you're interested in this topic. (You'll need an ACM login though.)

The IO language has one, in PHash.c. You can find the code for IO on Github. IO is BSD licensed.

I see the point about utilization, but this was my reasoning for trying this particular hashing scheme. Please let me know if I missed something.

To my knowledge, the possible alternatives to hashtables for building a dynamic dictionary are (balanced) binary trees and skiplists. Just for the sake of discussion, let's abstract from the key and value types and assume that we access both through a void *.

For a binary tree I would have:

struct node {
  void *key;
  void *value;
  struct node *left;
  struct node *right;
};

So, assuming all pointers have the same size s, storing n items will take 4*s*n bytes (4*s per item).

Skiplists come out almost the same, since the average number of forward pointers per node is 2, so again about 4*s bytes per item counting key and value.

In a hashtable I would have:

struct slot {
  void *key;
  void *value;
};

So each item only requires 2*s bytes of storage. With a load factor of 50%, storing n items needs the same 4*s*n bytes as the tree.
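Just to make the arithmetic concrete, here is a throwaway snippet (assuming 64-bit pointers, so s = 8, and repeating the two structs above):

#include <stddef.h>
#include <stdio.h>

struct node { void *key; void *value; struct node *left; struct node *right; };
struct slot { void *key; void *value; };

int main(void)
{
  size_t n = 1000000;                                 /* number of items        */
  size_t tree_bytes  = n * sizeof(struct node);       /* 4*s per item           */
  size_t table_bytes = 2 * n * sizeof(struct slot);   /* 2*s per slot, 50% load */
  printf("binary tree : %zu bytes\n", tree_bytes);    /* 32000000 with s = 8    */
  printf("cuckoo table: %zu bytes\n", table_bytes);   /* 32000000 as well       */
  return 0;
}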

It doesn't seem too bad to me: the cuckoo hashtable will occupy more or less the same amount of memory as a binary tree but will give me O(1) access time rather than O(log n).

And that's not counting the cost of keeping the tree balanced or the extra per-node space needed to store balancing information.

Other hashing schemes can achieve a better load factor (say 75% or 80%) but offer no guarantee on worst-case access time (which can even be O(n)).

By the way, d-ary cuckoo hashing and "cuckoo hashing with a stash" seem to be able to increase the load factor while still keeping constant access time.
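For what it's worth, the "stash" is just a small, constant-size overflow area that is checked after the regular d probes. A rough sketch of the idea (the names are made up, the slot struct from above is repeated, and keys are compared as plain pointers per the void * abstraction):

#include <stddef.h>

struct slot { void *key; void *value; };

#define STASH_SIZE 4                 /* a handful of entries is enough in practice */

static struct slot stash[STASH_SIZE];
static size_t stash_used;

/* When an insertion fails after too many kicks, park the item in the
   stash instead of rehashing right away.                              */
int stash_put(void *key, void *value)
{
  if (stash_used == STASH_SIZE)
    return 0;                        /* stash full: now we really must rehash */
  stash[stash_used].key = key;
  stash[stash_used].value = value;
  stash_used++;
  return 1;
}

/* Lookups stay O(1): the d table probes plus a scan of this tiny array. */
void *stash_get(void *key)
{
  for (size_t i = 0; i < stash_used; i++)
    if (stash[i].key == key)
      return stash[i].value;
  return NULL;
}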

Cuckoo hashing seems a valuable technique to me, and I thought it had already been explored; that's the reason for my question.

I can't speak for software, but cuckoo hashing is certainly used in hardware and is becoming very popular there. Major vendors of networking equipment have been looking into cuckoo hashing, and some already use it. The attraction of cuckoo hashing comes from the constant lookup time, of course, but also from the near-constant insertion time.

Although insertion can theoretically be unbounded, in practice it can be bounded to O(log n) in the number of rows in the table(s), and when measured the insertion time is about 1.1*d memory accesses on average. That's just 10% more than the absolute minimum! Memory access is often the limiting factor in networking equipment.

Independent hash functions are a must and selecting them properly is difficult. Good luck.

Following a comment from "onebyone", I've implemented and tested a couple of versions of Cuckoo hashing to determine the real memory requirement.

After some experimenting, the claim that you don't have to rehash until the table is almost 50% full seems to be true, especially if the "stash" trick is implemented.

The problem is when you enlarge the table. The usual approach is to double its size but this leads to the new table being only 25% utilized!

In fact, assume the hashtable has 16 slots: when I insert the 8th element, I'll run out of good slots and will have to rehash. I'll double the size, and now the table has 32 slots with only 8 of them occupied, which is 75% waste!

This is the price to pay for a "constant" retrieval time (in terms of an upper bound on the number of accesses/comparisons).

I've devised a different scheme, though: starting from a power of 2 greater than 1, if the table has n slots and n is a power of two, add n/2 slots, otherwise add n/3 slots (a small sketch of this rule follows the diagram below):

+--+--+
|  |  |                             2 slots
+--+--+

+--+--+--+
|  |  |  |                          3 slots
+--+--+--+ 

+--+--+--+--+
|  |  |  |  |                       4 slots
+--+--+--+--+

+--+--+--+--+--+--+
|  |  |  |  |  |  |                 6 slots
+--+--+--+--+--+--+

+--+--+--+--+--+--+--+--+
|  |  |  |  |  |  |  |  |           8 slots
+--+--+--+--+--+--+--+--+

etc.
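A sketch of that growth rule (the helper names are made up for this example):

#include <stddef.h>

/* True if n is a power of two (n > 0). */
static int is_pow2(size_t n)
{
  return (n & (n - 1)) == 0;
}

/* The growth rule described above: starting from 2, add n/2 slots when
   the current size n is a power of two, and n/3 slots otherwise.
   This produces 2, 3, 4, 6, 8, 12, 16, 24, 32, ...                    */
static size_t next_capacity(size_t n)
{
  return is_pow2(n) ? n + n / 2 : n + n / 3;
}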

Together with the assumption that rehashing only occurs when the table is 50% full, this means that after a rehash the table will be at most 66% empty (i.e. at least 1/3 full) rather than 75% empty (1/4 full) in the worst case.

I've also figured out (though I still need to check the math) that by enlarging the table by sqrt(n) each time, the wasted space asymptotically approaches 50%: after a rehash the n/2 items sit in n + sqrt(n) slots, and (n/2)/(n + sqrt(n)) tends to 1/2 as n grows.

Of course, the price to pay for lower memory consumption is an increase in the number of rehashes needed over time. Alas, nothing comes for free.

I'm going to investigate further if anyone is interested.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow