Question
I have a pretty large Google Multimap<String,String> and was looking into ways to reduce the memory usage. In all of the examples I can find, people are doing something like:
    Multimaps.newSetMultimap(
        TDecorators.wrap(new TIntObjectHashMap<Collection<Integer>>()),
        new Supplier<Set<Integer>>() {
            public Set<Integer> get() {
                return TDecorators.wrap(new TIntHashSet());
            }
        });
which works for a Multimap<Integer,Integer>. Is it possible to use Trove to wrap a Multimap<String,String>?
In case anyone is interested in the future: I went with http://code.google.com/p/jdbm2/ to write the hash map to the filesystem.
Solution 3
Trove4j doesn't contain a hash map specialized for String keys and String values.
See http://trove4j.sourceforge.net/javadocs/gnu/trove/map/hash/package-summary.html
OTHER TIPS
Guava's Multimaps are backed by standard JDK Collections, which aren't optimized for memory usage. For example, ArrayListMultimap<K, V> is backed by HashMap<K, ArrayList<V>> and HashMultimap<K, V> is backed by HashMap<K, HashSet<V>>.
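To make the overhead concrete, here is a rough plain-JDK sketch of the structure that sits behind a HashMultimap<String, String> as described above. The class name PlainJdkMultimap and the build method are hypothetical and only illustrate the layout; this is not Guava's actual implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration class, not part of Guava.
class PlainJdkMultimap {
    // Builds a HashMap<String, HashSet<String>>-style structure from {key, value} pairs.
    static Map<String, Set<String>> build(String[][] pairs) {
        Map<String, Set<String>> map = new HashMap<>();
        for (String[] pair : pairs) {
            // Every distinct key gets its own HashSet, even if it only ever holds one value.
            map.computeIfAbsent(pair[0], k -> new HashSet<>()).add(pair[1]);
        }
        return map;
    }
}
```

With many keys that hold only one or two values, the per-key HashSet is where most of the memory goes, which is why the alternatives below focus on leaner containers.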
Eclipse Collections (formerly GS Collections) has Multimaps backed by its own container types, UnifiedMap and UnifiedSet. UnifiedMap uses half the memory of HashMap, and UnifiedSet uses a quarter the memory of HashSet. The memory benefits you'll see will depend on whether you use a FastListMultimap or a UnifiedSetMultimap.
More detailed memory comparisons are available here.
Note: I am a committer for Eclipse Collections.
You could look at a memory-efficient variant of hash maps, such as this one: https://code.google.com/p/sparsehash/
If your value strings are long enough, compression could be an option. You could also look into disk backed solutions such as Ehcache, depending on your access statistics.
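As a minimal sketch of the compression idea, values could be stored as GZIP-compressed byte arrays and decompressed on read. The class and method names here (CompressedValues, compress, decompress) are hypothetical, and GZIP is just one possible codec from the JDK. It adds a fixed header and trailer, so short strings can end up larger than the original.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; names are illustrative only.
class CompressedValues {
    // Encodes the string as UTF-8 and GZIP-compresses the bytes.
    static byte[] compress(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed (and therefore finished) before we read the bytes.
        return out.toByteArray();
    }

    // Reverses compress(): inflates the bytes and decodes them as UTF-8.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The map would then hold byte[] values instead of String values, trading CPU time on every read for a smaller heap. Whether that pays off depends on how long and how compressible the values are.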
An approach I use is to use Map<String,Collection<String>> where the values start out as ArrayList<String> and get promoted to HashSet<String> when the bucket hits some threshold, say 32 elements. I have found this saves a lot of memory for small buckets.
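The promotion approach could be sketched roughly as follows. The class name PromotingMultimap, its methods, and the choice to reject duplicate values (set semantics) are assumptions made for illustration; the threshold is passed in, with 32 being the value suggested above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical helper class; the name and API are illustrative only.
class PromotingMultimap {
    // Bucket size at which an ArrayList is promoted to a HashSet.
    private final int threshold;
    private final Map<String, Collection<String>> buckets = new HashMap<>();

    PromotingMultimap(int threshold) {
        this.threshold = threshold;
    }

    // Adds a key/value pair; returns false if the pair was already present.
    boolean put(String key, String value) {
        Collection<String> bucket = buckets.get(key);
        if (bucket == null) {
            bucket = new ArrayList<>(1); // small buckets stay as compact lists
            buckets.put(key, bucket);
        }
        // Linear scan while the bucket is a list, hash lookup once promoted.
        if (bucket.contains(value)) {
            return false;
        }
        bucket.add(value);
        // Promote once the list reaches the threshold.
        if (bucket instanceof ArrayList && bucket.size() >= threshold) {
            buckets.put(key, new HashSet<>(bucket));
        }
        return true;
    }

    // Returns a read-only view of the values for a key (empty if the key is absent).
    Collection<String> get(String key) {
        Collection<String> bucket = buckets.get(key);
        if (bucket == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableCollection(bucket);
    }

    // True if the bucket for this key has been promoted to a HashSet.
    boolean isPromoted(String key) {
        return buckets.get(key) instanceof HashSet;
    }
}
```

The linear contains() scan is cheap for small lists, and the threshold bounds how long any scan can get before the bucket switches to hashing.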