Question

I have a problem where I need to assemble a Map whose eventual size is in the GBs (going on past 64GB) and I cannot assume that a user of the program will have this kind of monster machine hanging around. A nice solution would be to distribute this map across a number of machines to make a far more modest memory footprint per instance.

Does anyone know of a library/suite of tools which can perform this sharding? I do not care about replication or transactions; just spreading this memory requirement around.


Solution

I suggest that you start with Hazelcast:

http://www.hazelcast.com/

It is open-source, and in my opinion it is very easy to work with, so it is the best framework for rapid prototyping.

As far as I know, it performs faster than the commercial alternatives, so I wouldn't worry about performance either (I haven't formally benchmarked it myself).
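To give a rough idea of what this looks like, here is a minimal sketch using Hazelcast's distributed map API. The map name "large-map" and the value type are placeholders, and cluster configuration is left to Hazelcast's defaults, so treat it as an illustration rather than a ready-made setup:

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

import java.util.Map;

public class DistributedMapSketch {
    public static void main(String[] args) {
        // Start (or join) a Hazelcast cluster member in this JVM.
        HazelcastInstance instance = Hazelcast.newHazelcastInstance();

        // The returned map is partitioned across all cluster members,
        // so each JVM holds only a fraction of the total entries.
        Map<String, byte[]> shardedMap = instance.getMap("large-map");

        shardedMap.put("key-1", new byte[1024]);
        System.out.println("Entries visible from this node: " + shardedMap.size());

        instance.shutdown();
    }
}
```

Every JVM that runs this code and discovers the same cluster (via Hazelcast's default discovery mechanism) holds only its share of the partitions, which is what lets the total map grow past the memory of any single machine.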

OTHER TIPS

Terracotta might be useful; have a look here:

http://www.terracotta.org/

It's a clustered JVM, so how well it performs will probably depend on how often you update the map.

Must it be open source? If not, Oracle Coherence can do it.

You may be able to solve your problem by using a database instead. Something like HSQLDB (http://hsqldb.org/) may provide the functionality you need, with the ability to write the data to disk rather than keeping the whole thing in memory.
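For illustration, here is a minimal sketch of that approach with HSQLDB over plain JDBC. The database file name (mapdb), the kv table, and the VARBINARY size are arbitrary choices for the example; the key point is that a CACHED table keeps its rows on disk rather than in the heap:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class DiskBackedMapSketch {
    public static void main(String[] args) throws Exception {
        // File-backed HSQLDB database; requires hsqldb.jar on the classpath.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:hsqldb:file:mapdb", "SA", "")) {

            try (Statement ddl = conn.createStatement()) {
                // CACHED tables store rows on disk instead of in memory.
                ddl.execute("CREATE CACHED TABLE kv "
                        + "(k VARCHAR(256) PRIMARY KEY, v VARBINARY(1024))");
            }

            try (PreparedStatement put =
                         conn.prepareStatement("INSERT INTO kv VALUES (?, ?)")) {
                put.setString(1, "key-1");
                put.setBytes(2, new byte[]{1, 2, 3});
                put.executeUpdate();
            }

            try (PreparedStatement get =
                         conn.prepareStatement("SELECT v FROM kv WHERE k = ?")) {
                get.setString(1, "key-1");
                try (ResultSet rs = get.executeQuery()) {
                    if (rs.next()) {
                        System.out.println("value length: " + rs.getBytes(1).length);
                    }
                }
            }

            // Flush the file-based database cleanly on exit.
            try (Statement shutdown = conn.createStatement()) {
                shutdown.execute("SHUTDOWN");
            }
        }
    }
}
```

This trades the distributed-memory approach for disk-backed storage on a single machine, which may be enough if the real constraint is heap size rather than access latency.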

I would definitely take a step back and ask whether a map is the right data structure for GBs of data.

Gigaspaces Datagrid sounds like the kind of thing you are looking for. (Not free though)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow