Frage

I've constructed a Trie with about 400,000 nodes from a dictionary having about 180,000 words. The problem is that the trie is taking too long to construct on my phone.

So I've decided to create the trie once and store it in some format to the disk, for fast recreation whenever needed. But I'm unable to come up with a good format to store it.

What would be the most efficient format to encode the trie, so that it can be reconstructed from the file as fast as possible?

War es hilfreich?

Lösung

If your Trie data structure implements serializable then writing to and from a file should be fairly straight forward. Java will take care of the file representation.

See this link.

Andere Tipps

Maybe good idea - to keep tried in the memory buffer in the position-independent code, and read it into memory by mmap(). This is mostly quick way to work with trie from "cold start".

Also, maybe you can keep data not in the tries, but in the hashtable. By this method, you can keep in the memory only "bucket index", which is very small. And, when compute hash - pread() bucket into memory from file, and search in loaded part.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top