Question

I have two different lists of objects, one of them an IQueryable set (rolled up into an array) and the other a List set. Objects in both sets share a field called ID; each of the objects in the second set will match an object in the first set, but not necessarily vice versa. I need to be able to handle both groups (matched and unmatched). The size of both collections is between 300 and 350 objects in this case (for reference, the XML generated for the objects in the second set is usually no more than 7k, so think maybe half to two-thirds of that size for the actual memory used by each object in each set).

The way I have it currently set up is a for-loop that iterates through an array representation of the IQueryable set, using a LINQ statement to query the List set for the matching record. This takes too much time; I'm running a Core i7 with 10GB of RAM and it's taking anywhere from 10 seconds to 2.5 minutes to match and compare the objects. Task Manager doesn't show any huge memory usage--a shade under 25MB. None of my system threads are being taxed either.

Is there a method or algorithm that would allow me to pair up the objects in each set one time and thus iterate through the pairs and unmatched objects at a faster pace? This set of objects is just a small subset of the 8000+ this program will have to chew through each day once it goes live...

EDIT: Here's the code I'm actually running...

        for (int i = 0; i < draftRecords.Count(); i++)
        {
            sRecord record = (from r in sRecords where r.id == draftRecords.ToArray()[i].ID select r).FirstOrDefault();
            if (record != null)
            {
                // Do stuff with the draftRecords element based on the rest of the content of the sRecord object
            }
        }

Solution

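You should use a method such as Enumerable.Join or Enumerable.GroupJoin to match items from the two collections. In LINQ to Objects, both build a hash-based lookup over the second collection once, so matching takes roughly O(n + m) time instead of the O(n × m) of a nested scan. This will be far faster than doing nested for loops.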

Since each item in the first collection may or may not have a match in the second, GroupJoin is likely more appropriate: it returns every item from the first collection together with a (possibly empty) sequence of its matches. This would look something like:

// Pair each item in firstSet with all items in secondSet that share its Id.
var results = firstSet.GroupJoin(secondSet, f => f.Id, s => s.Id, (f, sset) => new { First = f, Seconds = sset });

foreach (var match in results)
{
    Console.WriteLine("Item {0} matches:", match.First);
    // Seconds is empty when the first item has no match.
    foreach (var second in match.Seconds)
        Console.WriteLine("   {0}", second); // each second item matching, one at a time
}
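
To make the matched/unmatched split concrete, here is a self-contained sketch of the same idea. The `Draft` and `SRecord` classes, their properties, the `MatchDemo.Pair` helper, and the sample data are all hypothetical stand-ins for the objects in the question; only `GroupJoin`, `FirstOrDefault`, and `ToList` are actual LINQ calls. It also assumes the queryable has been materialized once (for example with `ToList()`) before the join, so nothing is re-queried inside a loop.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two object types in the question.
public class Draft { public int ID { get; set; } }
public class SRecord { public int id { get; set; } }

public static class MatchDemo
{
    // Pairs every draft with its matching SRecord, or null when none exists.
    public static List<(Draft Source, SRecord Match)> Pair(
        IEnumerable<Draft> drafts, IEnumerable<SRecord> sRecords)
    {
        return drafts
            .GroupJoin(sRecords,
                       d => d.ID,   // key on the first set
                       s => s.id,   // key on the second set
                       (d, matches) => (Source: d, Match: matches.FirstOrDefault()))
            .ToList();
    }

    public static void Main()
    {
        // Assumed sample data: draft 1 has no counterpart in the second set.
        var drafts = new List<Draft> { new Draft { ID = 1 }, new Draft { ID = 2 }, new Draft { ID = 3 } };
        var sRecords = new List<SRecord> { new SRecord { id = 2 }, new SRecord { id = 3 } };

        foreach (var pair in Pair(drafts, sRecords))
        {
            if (pair.Match != null)
                Console.WriteLine("Draft {0}: matched", pair.Source.ID);
            else
                Console.WriteLine("Draft {0}: unmatched", pair.Source.ID);
        }
    }
}
```

Taking only the first match per key mirrors the `FirstOrDefault()` in the question; if several records can share an ID, keep the whole `matches` sequence instead.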

OTHER TIPS

Your question is lacking in sample code/information, but I would personally look to use methods like Join, Intersect, or Contains. If necessary, use Select to project the fields you want to match, or define a custom IEqualityComparer.
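
As a rough illustration of the Select plus Contains idea, the sketch below projects the IDs of the second set into a `HashSet<int>` once and then partitions the first set with constant-time lookups. The `Draft` and `SRecord` classes, the `int` key type, and the `IdPartition.Split` helper are assumptions made for the example, not names from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the two object types in the question.
public class Draft { public int ID { get; set; } }
public class SRecord { public int id { get; set; } }

public static class IdPartition
{
    // Splits drafts into those whose ID appears in sRecords and those whose ID does not.
    public static (List<Draft> Matched, List<Draft> Unmatched) Split(
        IEnumerable<Draft> drafts, IEnumerable<SRecord> sRecords)
    {
        // Project the keys once; HashSet<T>.Contains is a hash lookup, not a scan.
        var knownIds = new HashSet<int>(sRecords.Select(s => s.id));

        var matched = new List<Draft>();
        var unmatched = new List<Draft>();
        foreach (var d in drafts)
        {
            if (knownIds.Contains(d.ID)) matched.Add(d);
            else unmatched.Add(d);
        }
        return (matched, unmatched);
    }

    public static void Main()
    {
        // Assumed sample data for the demonstration.
        var drafts = new List<Draft> { new Draft { ID = 1 }, new Draft { ID = 2 }, new Draft { ID = 3 } };
        var sRecords = new List<SRecord> { new SRecord { id = 2 }, new SRecord { id = 3 } };

        var result = Split(drafts, sRecords);
        Console.WriteLine("Matched: {0}, unmatched: {1}", result.Matched.Count, result.Unmatched.Count);

        // Intersect on the projected keys yields just the shared IDs.
        var sharedIds = drafts.Select(d => d.ID).Intersect(sRecords.Select(s => s.id));
        Console.WriteLine("Shared IDs: {0}", string.Join(", ", sharedIds));
    }
}
```

This variant only tells you whether a match exists; when the matching object itself is needed, a join or a dictionary keyed by ID is the more natural fit.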

Licensed under: CC-BY-SA with attribution