Question

I am about to index 10 million titles by their IDs (for now, their line numbers). The titles will be tokenised before they are stored. The data structure has to be something like <String, ArrayList<Integer>>, where each String is a token and the Integers are line numbers.

I have to build this tool in Java with persistent storage and, if possible, without an RDBMS. Because this data structure is mutable, I couldn't find any tool that supports multimaps with the structure <String, ArrayList<Integer>> indexed by a B-tree or any other persistent data structure.

I tried MapDB, but it turned out to accept only immutable values, which does not work in my case (ArrayList).

Any thoughts are appreciated.

Solution

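What you need is called a multimap. MapDB does not support multimaps directly, but it has composite sets, which are almost as good: each (token, line number) pair is stored as one immutable element of a sorted set, so no mutable list value is needed.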

Example is here: https://github.com/jankotek/MapDB/blob/release-1.0/src/test/java/examples/MultiMap.java
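
To show the idea behind a composite set without relying on MapDB's exact API, here is a minimal sketch that uses only the JDK's in-memory TreeSet as a stand-in for a persistent sorted set. The class and method names (CompositeSetIndex, Entry, add, remove, lines) are hypothetical and chosen for illustration; the linked example shows how MapDB itself does it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical illustration: a multimap emulated with a sorted set of
// immutable (token, line) pairs. The TreeSet is an in-memory stand-in;
// with MapDB, a persistent sorted set would take its place (assumption).
class CompositeSetIndex {

    // One immutable (token, line) pair; pairs sort by token, then by line.
    static final class Entry implements Comparable<Entry> {
        final String token;
        final int line;

        Entry(String token, int line) {
            this.token = token;
            this.line = line;
        }

        @Override
        public int compareTo(Entry other) {
            int byToken = token.compareTo(other.token);
            return byToken != 0 ? byToken : Integer.compare(line, other.line);
        }
    }

    // TreeSet decides equality through compareTo, so duplicates are dropped.
    private final NavigableSet<Entry> entries = new TreeSet<>();

    // Adding a line to a token inserts one new element; nothing is mutated.
    void add(String token, int line) {
        entries.add(new Entry(token, line));
    }

    // Removing a single line from a token deletes exactly one element.
    boolean remove(String token, int line) {
        return entries.remove(new Entry(token, line));
    }

    // All lines for a token: a range scan over (token, MIN) .. (token, MAX).
    List<Integer> lines(String token) {
        Entry low = new Entry(token, Integer.MIN_VALUE);
        Entry high = new Entry(token, Integer.MAX_VALUE);
        List<Integer> result = new ArrayList<>();
        for (Entry e : entries.subSet(low, true, high, true)) {
            result.add(e.line);
        }
        return result;
    }

    public static void main(String[] args) {
        CompositeSetIndex index = new CompositeSetIndex();
        index.add("java", 3);
        index.add("java", 1);
        index.add("javascript", 2);
        System.out.println(index.lines("java"));
    }
}
```

Because every element is an immutable pair, adding or removing a line number for a token is a single set insert or delete rather than an update to a stored list, which is what makes this layout fit a store that only accepts immutable values.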

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow