Question

I am looking for a trie implementation for .NET.

I am planning to use it as the index structure for my in-memory object pool. It need not be thread-safe (only one thread will be updating it), but it should cope gracefully with at least 20 million items and with constant performance.

The ones I found on the net seem to be sample code or toy projects, so I am really looking for a production-quality implementation. Commercial libraries are also OK, if available.

PS: I chose tries because the hash table implementations I have seen seem to use too much memory and tend to cause memory fragmentation, as they are based on arrays. Any other container with O(1) lookup and benign memory usage for a large number of items would also be OK.

Thank you,


Solution

In my personal opinion, attempting to second-guess .NET's own memory management is not a practice I'd recommend. You simply can't exert the level of control over memory allocation that you can in a native scenario, but equally you shouldn't need to. I was obsessed with doing this when I first moved from C++ (where I would regularly work with my own heaps, write memory-localisation routines, etc.), but it swiftly became apparent that I didn't need to, nor could I.

For example, you could have an array of MyPooledObject at the bottom of your trie, but if that is a reference type, then you've just got an array of references; the actual memory for each object is somewhere else, in a location you can't control (unless you adapt your own host for the runtime).

That leaves using a value type instead, but these are simply not suitable for a pooled scenario, because custom value types should be immutable (I can say that safely without justifying it; just google 'immutable' and 'struct' targeting site:stackoverflow.com to see more) and are therefore no good as reusable objects.
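To illustrate why mutable value types tend to work badly as reusable pooled objects, here is a minimal sketch. `PooledItem` and `StructCopyDemo` are hypothetical names invented for this example. It shows that reading a struct out of a collection yields a copy, so a change made to that copy never reaches the stored element.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable value type, used only for illustration.
public struct PooledItem
{
    public int Value;
}

public static class StructCopyDemo
{
    // Returns the value stored in the list after mutating a local copy.
    public static int MutateCopy()
    {
        var pool = new List<PooledItem> { new PooledItem { Value = 1 } };

        PooledItem item = pool[0]; // copies the struct out of the list
        item.Value = 42;           // changes only the local copy

        return pool[0].Value;      // the stored element is unchanged
    }

    public static void Main()
    {
        Console.WriteLine(MutateCopy()); // prints 1
    }
}
```

To "reuse" the stored item you would have to write the whole struct back into the list, which is easy to forget and is one reason the usual advice is to keep structs immutable.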

If you need an indexed collection of objects in .NET where each is identifiable by a hashable key, then use a Dictionary<TKey, TValue>.
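As a rough sketch of that suggestion, the following builds a `Dictionary<long, MyPooledObject>` index. The `long` key, the `Id` property and the `PoolIndex` class are assumptions made for the example, since the question does not describe the real object type. Passing the expected item count to the constructor pre-sizes the dictionary so that it does not have to resize repeatedly while it is being filled.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pooled type; the question does not describe the real one.
public sealed class MyPooledObject
{
    public long Id { get; }

    public MyPooledObject(long id)
    {
        Id = id;
    }
}

public static class PoolIndex
{
    // Builds an index of `count` objects keyed by Id.
    public static Dictionary<long, MyPooledObject> Build(int count)
    {
        // Pre-sizing avoids repeated resizing as the dictionary fills.
        var index = new Dictionary<long, MyPooledObject>(count);
        for (long id = 0; id < count; id++)
        {
            index.Add(id, new MyPooledObject(id));
        }
        return index;
    }

    public static void Main()
    {
        // A small count keeps the demo quick; raise it to measure real memory use.
        Dictionary<long, MyPooledObject> index = Build(1000);

        if (index.TryGetValue(42, out MyPooledObject found))
        {
            Console.WriteLine(found.Id); // prints 42
        }
    }
}
```

Measuring this with a realistic item count on the target machine is probably the quickest way to find out whether the memory overhead is actually a problem.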

If you have too many objects to fit in memory then either:

1) Get more memory

2) Use a database and cache local segments of it

Or both: you could consider looking at AppFabric and its cache features; that way you can build a farm of machines dedicated to running in-memory caches of millions of objects. The cost of the hardware will probably be less than the cost of developing your own memory management solution for .NET :)
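Option 2, caching local segments of a database, could look something like the least-recently-used cache sketched here. This is only an illustration under assumptions: `LruCache` and `LruCacheDemo` are hypothetical names, the `loader` delegate stands in for a real database query, and none of it is an AppFabric API.

```csharp
using System;
using System.Collections.Generic;

// Minimal least-recently-used cache. The loader delegate is a stand-in
// for a real database query (an assumption made for this example).
public sealed class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Func<TKey, TValue> _loader;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map;
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCache(int capacity, Func<TKey, TValue> loader)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
        _loader = loader ?? throw new ArgumentNullException(nameof(loader));
        _map = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>(capacity);
    }

    public int Count => _map.Count;

    public TValue Get(TKey key)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Cache hit: move the entry to the front (most recently used).
            _order.Remove(node);
            _order.AddFirst(node);
            return node.Value.Value;
        }

        // Cache miss: load from the backing store.
        TValue value = _loader(key);
        if (_map.Count >= _capacity)
        {
            // Evict the least recently used entry, which sits at the back.
            var oldest = _order.Last;
            _order.RemoveLast();
            _map.Remove(oldest.Value.Key);
        }
        _map[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
        return value;
    }
}

public static class LruCacheDemo
{
    public static void Main()
    {
        int loads = 0;
        var cache = new LruCache<int, string>(2, id => { loads++; return "row " + id; });

        cache.Get(1); // miss, loaded
        cache.Get(2); // miss, loaded
        cache.Get(1); // hit, no load
        cache.Get(3); // miss, evicts key 2

        Console.WriteLine(loads); // prints 3
    }
}
```

The dictionary gives O(1) lookup and the linked list keeps the eviction order, so both hits and misses stay cheap regardless of how large the backing store is.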

OTHER TIPS

Take a look at this library: TrieNet

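using Gma.DataStructures.StringSearch;

...

// The constructor argument is the minimum suffix length to index (here 3).
var trie = new SuffixTrie<int>(3);

trie.Add("hello", 1);
trie.Add("world", 2);
trie.Add("hell", 3);

// Retrieves the values whose keys contain "hel".
var result = trie.Retrieve("hel");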
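For comparison, here is a minimal sketch of what a plain prefix trie does internally. It is not TrieNet's implementation: `SimpleTrie` and `SimpleTrieDemo` are hypothetical names, the trie matches prefixes only (not arbitrary substrings), and it makes no attempt at the memory tuning that 20 million items would need.

```csharp
using System;
using System.Collections.Generic;

// Illustrative prefix trie; one dictionary of children per node.
public sealed class SimpleTrie<TValue>
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public readonly List<TValue> Values = new List<TValue>();
    }

    private readonly Node _root = new Node();

    // Walks (and creates) one node per character, then stores the value.
    public void Add(string key, TValue value)
    {
        Node node = _root;
        foreach (char c in key)
        {
            if (!node.Children.TryGetValue(c, out Node next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.Values.Add(value);
    }

    // Returns the values of all keys that start with the given prefix.
    public List<TValue> Retrieve(string prefix)
    {
        var results = new List<TValue>();
        Node node = _root;
        foreach (char c in prefix)
        {
            if (!node.Children.TryGetValue(c, out node))
            {
                return results; // no key has this prefix
            }
        }
        Collect(node, results);
        return results;
    }

    // Gathers the values stored at this node and everywhere below it.
    private static void Collect(Node node, List<TValue> results)
    {
        results.AddRange(node.Values);
        foreach (Node child in node.Children.Values)
        {
            Collect(child, results);
        }
    }
}

public static class SimpleTrieDemo
{
    public static void Main()
    {
        var trie = new SimpleTrie<int>();
        trie.Add("hello", 1);
        trie.Add("world", 2);
        trie.Add("hell", 3);

        Console.WriteLine(trie.Retrieve("hel").Count); // prints 2
    }
}
```

Note that a node-per-character layout like this allocates many small objects, so it is unlikely to use less memory than a single Dictionary for the same keys.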
Licensed under: CC-BY-SA with attribution