Question

I get data from an external application:

class DataItem
{
   public string key;
   public int Attribute1;
   public string Attribute2;
}

One thread stores it in a collection. Other threads (3-10) query the collection by key (90% of queries) and by attributes (10%).

What is the best way to implement this if I have 10, 100, or 1000+ items in the collection?


Solution

If you really want an in-memory database, then SQLite with a managed data provider would be your best option. However, I suspect that in this case a ConcurrentDictionary would be enough. It can easily handle 1000+ items with many threads accessing it in parallel. The caveat is that each entry in the collection has only one key, so you may need a separate collection for each attribute you want to look up. Then again, if lookups by attribute are infrequent enough, you could simply enumerate the entire collection to find matching items, without any separate collections.
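A minimal sketch of that approach, assuming items are not modified after they are stored. The `DataItemStore` wrapper and its method names are hypothetical and only illustrate the idea: key lookups go straight to the ConcurrentDictionary, and the rare attribute queries fall back to a linear scan.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

public class DataItem
{
    public string key;
    public int Attribute1;
    public string Attribute2;
}

// Hypothetical wrapper; the name and methods are assumptions for illustration.
public class DataItemStore
{
    private readonly ConcurrentDictionary<string, DataItem> _items =
        new ConcurrentDictionary<string, DataItem>();

    // Called by the writer thread; adds the item or replaces an existing entry.
    public void Store(DataItem item)
    {
        _items[item.key] = item;
    }

    // Common case: lookup by key.
    public bool TryGetByKey(string key, out DataItem item)
    {
        return _items.TryGetValue(key, out item);
    }

    // Rare case: scan all entries. Enumerating a ConcurrentDictionary is safe
    // while other threads write, but it is not a point-in-time snapshot.
    public List<DataItem> FindByAttribute1(int value)
    {
        return _items.Where(pair => pair.Value.Attribute1 == value)
                     .Select(pair => pair.Value)
                     .ToList();
    }
}
```

Note that the dictionary only protects its own structure. If the writer thread changes the fields of a DataItem that is already stored, readers can observe a half-updated item, so replacing the whole item is the safer pattern.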

OTHER TIPS

If the collection is immutable (read-only, never changing) after initialization, and the collection is initialized before any threads can get to it, you don't need to do anything special. Multiple threads can read from a collection or dictionary concurrently without any problems.

Problems arise only when the shared object (the collection) changes state while multiple threads are using it. Updating the collection while other threads are reading from it would create a problem for multithreaded access, as would a collection that maintains internal cache lists or similar state that changes on reads.

You don't even need explicit locks to protect the collection during initialization if you set up the collection as a static object initialized in its static constructor. .NET guarantees that the class is initialized before first use.

You can save yourself a lot of headaches and work if you can redefine the problem so that the collection is immutable after initialization.
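The static-constructor idea could look roughly like this. The `DataItemCatalog` class and its `LoadItems` helper are hypothetical stand-ins for however the data actually arrives from the external application; the point is only that the dictionary is filled once and never modified afterwards.

```csharp
using System;
using System.Collections.Generic;

public class DataItem
{
    public string key;
    public int Attribute1;
    public string Attribute2;
}

// Hypothetical class name; filled once, then only read.
public static class DataItemCatalog
{
    private static readonly Dictionary<string, DataItem> Items;

    // The runtime runs this exactly once, before the first use of the class,
    // so no explicit lock is needed during initialization.
    static DataItemCatalog()
    {
        Items = new Dictionary<string, DataItem>();
        foreach (DataItem item in LoadItems())
        {
            Items.Add(item.key, item);
        }
    }

    // Placeholder data source (an assumption for this sketch).
    private static IEnumerable<DataItem> LoadItems()
    {
        yield return new DataItem { key = "a", Attribute1 = 1, Attribute2 = "x" };
        yield return new DataItem { key = "b", Attribute1 = 2, Attribute2 = "y" };
    }

    // Safe to call from many threads because the dictionary never changes.
    public static bool TryGet(string key, out DataItem item)
    {
        return Items.TryGetValue(key, out item);
    }
}
```

This only works if nothing writes to the dictionary or to the stored items after the static constructor finishes. With a writer thread that keeps storing new data, as in the question, a concurrent collection or explicit synchronization is still needed.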

Is the in-memory collection intended to be read only? It will make a difference in what you end up using.

My recommendations -
Read only: use ConcurrentDictionary
Read & Write: use DataSet

The best concurrent, or Thread-Safe, model, in my opinion, would be the DataSet - see: ADO.Net Tackle Data Concurrency and MSDN DataSet. The DataSet was developed to handle in-memory data storage for multiple clients. NOTE what MSDN says:

This type is safe for multithreaded read operations. You must synchronize any write operations.
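One way to follow that note is to guard a DataTable with a ReaderWriterLockSlim, so readers run in parallel and the writer gets exclusive access. This is a sketch under that assumption, not something the linked articles prescribe; the `DataItemTable` class and its method names are made up for illustration, and the column names simply mirror the DataItem fields.

```csharp
using System;
using System.Data;
using System.Threading;

// Hypothetical wrapper around a DataTable; names are assumptions.
public class DataItemTable
{
    private readonly DataTable _table = new DataTable("Items");
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public DataItemTable()
    {
        DataColumn keyColumn = _table.Columns.Add("key", typeof(string));
        _table.Columns.Add("Attribute1", typeof(int));
        _table.Columns.Add("Attribute2", typeof(string));
        // A primary key is required for Rows.Find and lets LoadDataRow match rows.
        _table.PrimaryKey = new[] { keyColumn };
    }

    // Writer thread: exclusive access while the table changes.
    public void Store(string key, int attribute1, string attribute2)
    {
        _lock.EnterWriteLock();
        try
        {
            // Updates the row with this primary key, or adds a new row.
            _table.LoadDataRow(new object[] { key, attribute1, attribute2 }, true);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Reader threads: many may hold the read lock at the same time.
    public bool TryGetAttribute1(string key, out int attribute1)
    {
        _lock.EnterReadLock();
        try
        {
            DataRow row = _table.Rows.Find(key);
            attribute1 = row == null ? 0 : (int)row["Attribute1"];
            return row != null;
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```

The values are copied out while the read lock is held, so callers never touch a DataRow that the writer might change later.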

You do have an alternative to a DataSet, as Brian Gideon suggests - a ConcurrentDictionary.

With a DataReader, you can fill custom objects, like DataItem, directly from the DataReader.

Either way, both of these solutions will allow you quick and concurrent in-memory data access.

Licensed under: CC-BY-SA with attribution