Question

A while ago I played around with the Kademlia (KAD) protocol. I understood how it worked and I got the idea that one might use it to create a distributed data store.

There is one problem, though: in Kademlia, each piece of data has a node that "owns" it. When the data is requested, it is propagated to the next node with a TTL assigned, and it is deleted there once the TTL runs out. The idea in Kademlia is that the "owner" node refreshes the data on the other nodes before it expires there.

As far as I understand, this means the data stays cached even if the "owner" node leaves the network, but only for a while. If the owner node never comes back, every copy made from it will eventually expire, so after a while the data will be gone.
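To make the failure mode concrete, here is a minimal sketch of the behaviour as I described it above. It is a toy simulation, not Kademlia itself: the `TtlStore` class, the `simulate` function, and the TTL and refresh intervals are all made up for illustration.

```python
class TtlStore:
    """Toy cache: an entry disappears unless it is refreshed before its TTL ends."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # key -> (value, expiry_time)

    def put(self, key, value, now):
        # Storing (or re-storing) a key restarts its TTL.
        self.entries[key] = (value, now + self.ttl)

    def get(self, key, now):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if now >= expires_at:
            del self.entries[key]  # expired: drop the copy
            return None
        return value


def simulate(owner_leaves_at, ttl=10, refresh_every=5, horizon=40):
    """Return the first time step at which the cached copy is gone, or None."""
    cache = TtlStore(ttl)
    cache.put("k", "v", now=0)
    for now in range(1, horizon + 1):
        owner_online = now < owner_leaves_at
        # Hypothetical policy: the owner refreshes the copy every few steps.
        if owner_online and now % refresh_every == 0:
            cache.put("k", "v", now)
        if cache.get("k", now) is None:
            return now
    return None
```

With these assumed numbers, `simulate(owner_leaves_at=12)` loses the data one TTL after the last refresh, while an owner that stays online for the whole run keeps the copy alive indefinitely.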

While this is okay for a P2P network where people want to share files, it is not acceptable for a distributed data store.

How could one deal with this?

Or is there another P2P protocol similar to Kademlia that takes this into consideration? In my mind, the "perfect" solution would always keep N nodes holding the replicated data. As soon as one of them leaves, the remaining N-1 nodes look for another node to push the data to, so that there are N replicas again.
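What I have in mind could be sketched roughly like this. It is only an in-memory illustration under my own assumptions: the `Cluster` class and its methods are hypothetical, node IDs and keys are small integers, and replicas are placed on the N live nodes closest to the key by XOR distance (borrowing Kademlia's metric). A real network would also have to detect departures and cope with several holders failing at the same time.

```python
def xor_distance(a, b):
    """Kademlia-style distance between two integer IDs."""
    return a ^ b


def closest_nodes(key, live_nodes, n):
    """Return the n live node IDs closest to the key."""
    return sorted(live_nodes, key=lambda node: xor_distance(node, key))[:n]


class Cluster:
    """Hypothetical in-memory model of a network that keeps N replicas per key."""

    def __init__(self, node_ids, replicas):
        self.replicas = replicas
        self.stores = {node_id: {} for node_id in node_ids}  # node -> its data

    def put(self, key, value):
        for node in closest_nodes(key, self.stores, self.replicas):
            self.stores[node][key] = value

    def holders(self, key):
        return sorted(n for n, store in self.stores.items() if key in store)

    def leave(self, node_id):
        # The departed node's copies are simply gone.
        del self.stores[node_id]
        self.repair()

    def repair(self):
        # Gather every key that still has at least one surviving copy ...
        surviving = {}
        for store in self.stores.values():
            surviving.update(store)
        # ... and push it to any of the N closest live nodes that lack it.
        for key, value in surviving.items():
            for node in closest_nodes(key, self.stores, self.replicas):
                self.stores[node].setdefault(key, value)
```

For example, with nodes `[1, 4, 6, 9, 12]` and `replicas=3`, key `5` is first held by nodes 1, 4, and 6. After `leave(4)`, the repair step copies it to node 12, so there are three holders again. The data is still lost if all N holders disappear before a repair runs.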

Does such a protocol exist?


Solution

Are you interested in developing your own implementation of the protocol, or in using an existing solution?

If you want to play around with your own implementation, I would suggest looking at the Chord DHT, which I think is a good one.

Licensed under: CC-BY-SA with attribution