Question

public class User
{
   public int Id { get; set; }
   public int Age { get; set; }
   public string Name { get; set; }
}

I have 100k users.

Query: Get Users Whose Name is "Rafael" AND whose age is between 40 and 50

  • By Linq2Objects : users.Where(p=>p. Name=="Rafael" && p.Age>=40 && p.Age<=50).ToArray();

Is there any alternative implemantation with better performance? (Readonly Thread-Safe)

(MultiIndexed User Array)

I've tested it's performance. For 1000k users it takes 30-50 ms. It seems not important but it is. Because I can get 50 requests in a second.

With dharnitski's solution. It takes 0ms. :)

But is there any code framework makes it transparently.

public  class   FastArray<T> 
Was it helpful?

Solution

You cannot get result you want without full dataset scan if your data is not prepared. Prepare data in advance when time is not critical and work with sorted data when you need short response time.

There is an analogy for this in database world.

There is a table with 100K records. Somebody wants to run a Select query with "where" clause that filter data by not primary key. It always will be slow "table scan" operation in execution plan unless index(es) is implemented.

Sample of code that implements indexing using ILookup<TKey, TValue>:

//not sorted array of users - raw data
User[] originalUsers;
//Prepare data in advance (create one index). 
//Field with the best distribution should be used as key
ILookup<string, User> preparedUsers = originalUsers.ToLookup(u => u.Name, u => u);


//run this code when you need subset 
//search by key is optimized by .NET class 
//"where" clause works with small set of data
preparedUsers["Rafael"].Where(p=> p.Age>=40 && p.Age<=50).ToArray();

This code is not as powerful as database indexes (for example it does not support substrings) but it shows the idea.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top