Question

Why does the Kademlia Distributed Hash Table use UDP as its network transport protocol, even though it's unreliable?

Solution

The main reason is that, during a lookup, you rapidly query many nodes that you have never contacted before and may never contact again.

Kademlia lookups are iterative, i.e. requests are not forwarded: the querying node contacts every node along the way itself. A forwarding DHT would be better suited to long-standing TCP connections.
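To make "iterative" concrete, here is a minimal Python sketch under simplifying assumptions: the network is an in-memory dict (the hypothetical `network`), node IDs are small integers, only one node is queried at a time, and `k` is tiny. Real deployments use much larger IDs and query several nodes in parallel. The only point is that the requester itself talks to each node in turn.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: bitwise XOR of two node IDs."""
    return a ^ b


def iterative_lookup(network, start, target, k=2):
    """Return (k closest IDs found, IDs in the order the requester queried them).

    `network` maps node ID -> set of IDs that node knows about. It is a
    hypothetical stand-in for real routing tables and real RPCs.
    """
    def by_distance(ids):
        return sorted(ids, key=lambda n: xor_distance(n, target))

    shortlist = {start}
    queried = []
    while True:
        pending = by_distance(shortlist - set(queried))
        if not pending:
            break
        node = pending[0]
        queried.append(node)
        # One request/response: the requester asks `node` directly for the
        # contacts it knows that are closest to the target.
        shortlist.update(by_distance(network[node])[:k])
        # Keep only the k closest candidates seen so far.
        shortlist = set(by_distance(shortlist)[:k])
    return by_distance(shortlist), queried


network = {1: {2, 8}, 2: {1, 3, 12}, 3: {2}, 8: {1, 12}, 12: {2, 8, 13}, 13: {12}}
closest, contacted = iterative_lookup(network, start=1, target=13)
print(closest, contacted)
```

In this toy network the requester ends up exchanging a message with nodes 1, 8, 12 and 13 in turn; none of them passes the request on to another node.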

As a result, a large share of the traffic consists of short-lived request/response exchanges between nodes of a network that may have millions of members. The overhead of rapidly establishing thousands of TCP connections would be prohibitive.
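A rough sketch of what such an exchange looks like with UDP, in Python: a single socket sends one datagram to each of several peers and collects the replies, with no per-peer connection setup. The helper names (`run_node`, `start_nodes`, `query_all`) and the `PING`/`PONG` payloads are made up for illustration, and the "peers" are just threads on localhost.

```python
import socket
import threading


def run_node(sock):
    """Hypothetical stand-in for a remote node: answer one datagram, then exit."""
    data, addr = sock.recvfrom(1024)
    sock.sendto(b"PONG:" + data, addr)
    sock.close()


def start_nodes(count):
    """Start `count` local nodes on OS-chosen ports and return their addresses."""
    addrs = []
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.bind(("127.0.0.1", 0))
        addrs.append(s.getsockname())
        threading.Thread(target=run_node, args=(s,), daemon=True).start()
    return addrs


def query_all(addrs, payload, timeout=1.0):
    """Send one request to every address from a single socket; collect replies."""
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client.settimeout(timeout)
    replies = {}
    try:
        for addr in addrs:
            client.sendto(payload, addr)  # no handshake, no per-peer state
        while len(replies) < len(addrs):
            data, addr = client.recvfrom(1024)
            replies[addr] = data
    except socket.timeout:
        pass  # missing replies are simply treated as lost
    finally:
        client.close()
    return replies


addrs = start_nodes(3)
replies = query_all(addrs, b"PING")
print(len(replies), "replies")
```

Doing the same over TCP would need a separate socket and a connection handshake for every peer before the first request byte could be sent.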

OTHER TIPS

Why UDP? Because it is a simple, effective and low-cost protocol. It does not guarantee delivery of packets and does not require establishing a persistent connection. These properties make UDP a good fit for fast data delivery to multiple recipients, which is what P2P applications need.

Citation from Kademlia's Design Specification:

Kademlia's designers do not appear to have taken into consideration the use of IPv6 addresses or TCP/IP instead of UDP or the possibility of a Kademlia node having multiple IP addresses.

I must admit that I have not used this product, but researching it makes me think I can answer this.

It appears to be an eventually coherent system. It also appears to be a high-performance system. Given this, UDP would work. There is no handshake like there is for TCP, so it's fast. There is also a correction mechanism, so possible corruption from the protocol is dealt with.

Our version of Kademlia (OpenKad) can use either TCP or UDP.

Kademlia is a high-level routing protocol and as such works the same over both transport-level protocols. Lookup time in Kademlia deployments is not that good, due to failures, dropped packets and timeouts, so performance is not the best answer.

I know this is likely to bring a lot of debate, but UDP is not specifically designed to be unreliable; it just lacks the features that would make it reliable. From a higher-level perspective (such as socket programming) UDP and TCP look and feel very similar, but they are in fact not comparable. TCP is designed to handle most transport-related issues out of the box, whereas UDP only adds a small header (ports, length and checksum) to the underlying IP packet, and that's pretty much its full extent as a protocol.
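To show how little UDP adds, here is a small Python sketch that packs the 8-byte UDP header: four 16-bit fields for source port, destination port, length and checksum. The function name and port numbers are arbitrary, and the checksum is left at zero for simplicity (in IPv4 a zero value means none was computed).

```python
import struct


def udp_header(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Pack a UDP header; the length field covers the header plus the payload."""
    length = 8 + len(payload)
    # "!HHHH" = network byte order, four unsigned 16-bit integers
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)


header = udp_header(6881, 6882, b"PING")
print(len(header), struct.unpack("!HHHH", header))
```

Everything else (ordering, retransmission, flow control, connection state) is absent from the header and therefore left to the application.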

Naturally, you're able to build onto and extend both protocols. Extending UDP is usually not the right solution, because TCP typically handles all you need in networking, but one of the few exceptions is when TCP's connection model is too limiting. This is the case in P2P networking: TCP is designed to emulate an exclusive one-to-one pipe between two endpoints, whereas P2P connections are typically more of an all-to-all nature.

To make a long story short, you're going to be "reinventing the wheel" either way at this point, by either adding 'reliability' to UDP or by creating an all-to-all TCP implementation.
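As an illustration of the first option, here is a minimal Python sketch of the kind of 'reliability' an application might layer on top of UDP: a random transaction ID to match replies to requests, a timeout, and a bounded number of retries. The names (`request_with_retry`, `flaky_echo`) and the 4-byte ID are assumptions made for this example, not taken from any particular implementation.

```python
import os
import socket
import threading


def request_with_retry(sock, addr, payload, retries=3, timeout=0.5):
    """Send `payload` to `addr`; resend on timeout; return the reply or None."""
    txid = os.urandom(4)  # random tag so late or stray datagrams can be ignored
    sock.settimeout(timeout)
    for _ in range(retries):
        sock.sendto(txid + payload, addr)
        try:
            while True:
                data, sender = sock.recvfrom(2048)
                if sender == addr and data[:4] == txid:
                    return data[4:]
                # Anything else is unrelated traffic; keep waiting.
        except socket.timeout:
            continue  # no matching reply in time: try again
    return None


def flaky_echo(sock, drop_first=1):
    """Hypothetical peer for the demo: ignore `drop_first` datagrams, then echo one."""
    for _ in range(drop_first):
        sock.recvfrom(2048)
    data, addr = sock.recvfrom(2048)
    sock.sendto(data, addr)


peer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
peer.bind(("127.0.0.1", 0))
threading.Thread(target=flaky_echo, args=(peer,), daemon=True).start()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
reply = request_with_retry(client, peer.getsockname(), b"PING", timeout=0.2)
print(reply)
```

Here the first request is "lost" on purpose and the second attempt succeeds. A peer that never answers simply yields `None`, which a DHT can treat as a dead contact and move on.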

To answer the meat of your question (why UDP in Kademlia): I don't think the spec actually explains why to use UDP (or why not to use TCP), so I don't think there is an authoritative answer. My guess is that the authors are/were of the opinion that building onto UDP would offer more flexibility than attempting to stretch TCP in a direction it specifically wasn't designed for. In other words, it is a choice between adding onto a library that has practically no features and working around and/or stretching the existing features of another.

Licensed under: CC-BY-SA with attribution