Question

I am looking for a reasonably well tested library+server to store a persistent distributed hash table.

I am hesistant to use SQL-based solutions as the data is highly document oriented, consisting of millions of ~64KB blobs with only a single index (computed by hash of said BLOB) - and needs to be able to be distributed for long term scaling prospects.

Due to expense and bandwidth considerations, external solutions such as S3 are not an option.

Something like CouchDB or Project Voldemort would be ideal - however there is a noticable lack of .NET bindings for both (PV can be IKVMC'd from Java - however has "issues".). Both key and value are byte arrays (key is 16 byte, the value is up to 2048KB averaging 64KB)

I have searched so far for some kind of .NET port of Dynamo, Chord and similar - however the majority of results appear to be purely in-memory caches and lack any form of persistence or replication.

Anyone got any ideas or suggestions?

Was it helpful?

Solution

Take a look at Ayende's Rhino DHT. Might be more inline with what you are looking for. The source can be acquired here.

OTHER TIPS

DryadLINQ or Hadoop.Net may help.

Hadoop.Net is dotnet version of Hadoop. More about Hadoop can be found here

I actually think you should consider SQL Server 2008. Store the data in a table with a varbinary(max) column, along with a column that contains the hash of that column. Index the hash, as you suggested.

You'll then be able to use the various distribution features of the product.

Consider MS Velocity.

Summary: “Velocity” is a distributed in-memory application cache platform for developing scalable, available, and high-performance applications. “Velocity” fuses memory across multiple computers to give a single unified cache view to applications. Applications can store any serializable CLR object without worrying about where the object gets stored. Scalability can be achieved by simply adding more computers on demand. “Velocity” also allows for copies of data to be stored across the cluster, thus protecting data against failures. “Velocity” can be configured to run as a service accessed over the network or can be run embedded with the distributed application.

You can try StorageEdge it has NCache technology at it's back-end so by using it you'll have the support of distributed cache which will boost SharePoint performance, reliability, scalability and optimize its storage at the same time.

Here is link to StorageEdge's homepage http://www.alachisoft.com/storageedge/ I hope it help :)

Rest for .NET you can always try NCache a big name in Distributed Caching you can find its details on http://www.alachisoft.com/ncache/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top