Question

I have a huge Collection of objects (which I can treat as a typed enumerable using OfType<>()). Each of these objects has a Category property, whose value is drawn from a list defined elsewhere in the application. This Collection can grow to hundreds of items, but it is possible that only, say, 6 of the 30 possible Categories are actually used. What is the fastest way to find these 6 Categories? The size of the huge Collection discourages me from just iterating across the entire thing and returning all unique values, so is there a faster way to accomplish this?

Ideally I'd collect the categories into a List<string>.

Solution

If you are using .NET 3.5 or later, try this:

List<string> categories = collection
    .Cast<Foo>()
    .Select(foo => foo.Category)
    .Distinct()
    .ToList();

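It should be very fast: Distinct() makes a single pass over the items and tracks the values it has already seen in a hash-based set, so the cost grows linearly with the size of the collection. For a few hundred items that is negligible.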
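For anyone who wants to try the query in isolation, here is a minimal, self-contained sketch. The Foo class, the UsedCategories helper, and the sample data are assumptions made up for illustration; the only thing taken from the question is that each item exposes a string Category. It uses OfType<Foo>(), as mentioned in the question, which skips elements that are not Foo instead of throwing the way Cast<Foo>() would.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; only the Category property matters here.
public class Foo
{
    public string Category { get; set; }
}

public static class Program
{
    // Returns every category that occurs at least once in the collection.
    public static List<string> UsedCategories(IEnumerable items)
    {
        return items
            .OfType<Foo>()               // ignores nulls and non-Foo elements
            .Select(foo => foo.Category)
            .Distinct()                  // single pass, hash-based
            .ToList();
    }

    public static void Main()
    {
        // A non-generic collection, standing in for the asker's Collection.
        var collection = new ArrayList
        {
            new Foo { Category = "Books" },
            new Foo { Category = "Music" },
            new Foo { Category = "Books" },
        };

        List<string> categories = UsedCategories(collection);

        // Two distinct categories are found: Books and Music.
        Console.WriteLine(categories.Count); // prints 2
        Console.WriteLine(string.Join(", ", categories.ToArray()));
    }
}
```

If the collection is guaranteed to contain only Foo instances, Cast<Foo>() from the answer above works just as well.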

I assume these objects originally came from a database? If so, you might want to ask the database to do the work for you (for example, with a SELECT DISTINCT query on that column). If there is an index on that column, you will get the result almost instantly without even having to fetch the objects into memory.

OTHER TIPS

The size of the huge Collection discourages me from just iterating across the entire thing and returning all unique values

I am afraid that, in order to find all used categories, you will have to look at each item at least once, so you cannot avoid iterating (unless you keep track of the used categories while building your collection).
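Keeping track while building the collection could look roughly like the sketch below. It is only an illustration under some assumptions: TrackedFooCollection and Foo are hypothetical names, the collection is assumed to be one you control and can derive from Collection<T>, items and their Category are assumed to be non-null, and Category is assumed not to change after an item has been added (otherwise the counts go stale).

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical item type; Category is assumed non-null and unchanged after insertion.
public class Foo
{
    public string Category { get; set; }
}

// Hypothetical wrapper: keeps a per-category item count in step with the contents.
public class TrackedFooCollection : Collection<Foo>
{
    private readonly Dictionary<string, int> counts = new Dictionary<string, int>();

    // Categories currently in use; cost depends on the number of categories, not items.
    public List<string> UsedCategories
    {
        get { return new List<string>(counts.Keys); }
    }

    protected override void InsertItem(int index, Foo item)
    {
        base.InsertItem(index, item);
        Increment(item.Category);
    }

    protected override void RemoveItem(int index)
    {
        Decrement(this[index].Category);
        base.RemoveItem(index);
    }

    protected override void SetItem(int index, Foo item)
    {
        Decrement(this[index].Category);
        base.SetItem(index, item);
        Increment(item.Category);
    }

    protected override void ClearItems()
    {
        base.ClearItems();
        counts.Clear();
    }

    private void Increment(string category)
    {
        int n;
        counts.TryGetValue(category, out n); // n stays 0 if the key is absent
        counts[category] = n + 1;
    }

    private void Decrement(string category)
    {
        int n = counts[category] - 1;
        if (n == 0) counts.Remove(category); // last item of this category is gone
        else counts[category] = n;
    }
}

public static class Program
{
    public static void Main()
    {
        var items = new TrackedFooCollection();
        items.Add(new Foo { Category = "Books" });
        items.Add(new Foo { Category = "Music" });
        items.Add(new Foo { Category = "Books" });
        items.RemoveAt(1); // removes the only "Music" item

        Console.WriteLine(string.Join(", ", items.UsedCategories.ToArray())); // prints "Books"
    }
}
```

The trade-off is a little bookkeeping on every add and remove in exchange for not scanning the items when the categories are requested. For a few hundred items the plain LINQ query is likely simpler and fast enough.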

Check whether Mark Byers' solution is fast enough for you, and only worry about its performance if it isn't.

Licensed under: CC-BY-SA with attribution