Question

Situation: I have a List<IQueryable<MyDataStructure>>. I want to run a single LINQ query on each of them, in parallel, and then join the results.

Question: How do I create a LINQ query that I can pass as a parameter?

Example code:

Here's some simplified code. First, I have the collection of IQueryable<string>:

    public List<IQueryable<string>> GetQueries()
    {
        var set1 = (new List<string> { "hello", "hey" }).AsQueryable();
        var set2 = (new List<string> { "cat", "dog", "house" }).AsQueryable();
        var set3 = (new List<string> { "cat", "dog", "house" }).AsQueryable();
        var set4 = (new List<string> { "hello", "hey" }).AsQueryable();

        var sets = new List<IQueryable<string>> { set1, set2, set3, set4 };

        return sets;
    }

I would like to find all the words which start with the letter 'h'. With a single IQueryable<string> this is easy:

    query.Where(x => x.StartsWith("h")).ToList()

But I want to run the same query against all the IQueryable<string> objects in parallel and then combine the results. Here's one way to do it:

        var result = new ConcurrentBag<string>();
        Parallel.ForEach(queries, query =>
        {
            var partOfResult = query.Where(x => x.StartsWith("h")).ToList();

            foreach (var word in partOfResult)
            {
                result.Add(word);
            }
        });

        Console.WriteLine(result.Count);

But I want a more generic solution, so that I can define the LINQ operation separately and pass it as a parameter to a method. Something like this:

        var query = Where(x => x.FirstName.StartsWith("d") && x.IsRemoved == false)
            .Select(x => x.FirstName)
            .OrderBy(x => x);

        var queries = GetQueries();

        var result = Run(queries, query);

But I'm at a loss on how to do this. Any ideas?

Solution

So the first thing that you wanted was a way of taking a sequence of queries, executing all of them, and then getting the flattened list of results. That's simple enough:

public static IEnumerable<T> Foo<T>(IEnumerable<IQueryable<T>> queries)
{
    return queries.AsParallel()
            .Select(query => query.ToList())
            .SelectMany(results => results);
}

Each query is executed (by calling ToList on it) in parallel, thanks to AsParallel, and the results are then flattened into a single sequence through SelectMany.
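
To make this concrete, here is a minimal sketch that applies Foo to two of the string sets from the question. The class name FooDemo and the helper HWords are hypothetical names used only for this illustration. It also assumes you do not care about the order of the combined results: PLINQ does not guarantee ordering unless you ask for it, so the sketch sorts the flattened sequence before printing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FooDemo
{
    // Same shape as Foo above: run each query in parallel, then flatten.
    public static IEnumerable<T> Foo<T>(IEnumerable<IQueryable<T>> queries)
    {
        return queries.AsParallel()
            .Select(query => query.ToList())
            .SelectMany(results => results);
    }

    // Hypothetical helper: finds the words starting with "h" across all sets.
    public static List<string> HWords()
    {
        var sets = new List<IQueryable<string>>
        {
            new List<string> { "hello", "hey" }.AsQueryable(),
            new List<string> { "cat", "dog", "house" }.AsQueryable(),
        };

        // Composing the filter is cheap; nothing runs until ToList is called inside Foo.
        var filtered = sets.Select(q => q.Where(x => x.StartsWith("h")));

        // The parallel flatten has no guaranteed order, so sort the combined result.
        return Foo(filtered).OrderBy(w => w, StringComparer.Ordinal).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", HWords())); // hello, hey, house
    }
}
```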

The other thing that you wanted to do was to add a number of query operations to each query in a sequence of queries. This doesn't need to be parallelized (thanks to deferred execution, the calls to Where, OrderBy, etc. take almost no time) and can just be done through Select:

var queries = GetQueries().Select(query =>
    query.Where(x => x.FirstName.StartsWith("d")
        && !x.IsRemoved)
    .Select(x => x.FirstName)
    .OrderBy(name => name)); // after Select, each element is already a string

var results = Foo(queries);
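
The deferred-execution point can be checked with a small experiment. This is only a sketch: it uses IEnumerable<string> rather than IQueryable<string>, because a lambda with a statement body (needed here to count calls) cannot be converted to an expression tree. The names DeferredDemo and Measure are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns the predicate call count before and after enumeration,
    // plus the number of matching words.
    public static (int before, int after, int count) Measure()
    {
        var words = new List<string> { "hello", "cat", "house" };
        int predicateCalls = 0;

        // Building the query only records what to do; the predicate has not run yet.
        IEnumerable<string> query = words.Where(x =>
        {
            predicateCalls++;
            return x.StartsWith("h");
        });
        int before = predicateCalls;

        // Enumerating (here via ToList) is what actually runs the predicate.
        var result = query.ToList();
        return (before, predicateCalls, result.Count);
    }

    public static void Main()
    {
        var (before, after, count) = Measure();
        Console.WriteLine(before); // 0
        Console.WriteLine(after);  // 3
        Console.WriteLine(count);  // 2
    }
}
```

This is why only the ToList step is worth parallelizing: composing Where, Select and OrderBy does no work on the data.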

Personally, I don't see a need to combine these two methods; they're rather separate concepts. If you do want them combined, though, here it is:

public static IEnumerable<TResult> Bar<TSource, TResult>(
    IEnumerable<IQueryable<TSource>> queries,
    Func<IQueryable<TSource>, IQueryable<TResult>> selector)
{

    return queries.Select(selector)
        .AsParallel()
        .Select(query => query.ToList())
        .SelectMany(results => results);
}

Feel free to make either Foo or Bar an extension method if you want. Also, you really should give them better names if you're going to use them.
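
As a rough illustration of how Bar lines up with the Run(queries, query) call from the question, the sketch below defines the query once as a Func and passes it in. The Person class, the sample data, and the names BarDemo and Run are assumptions made for this example; only the FirstName and IsRemoved properties come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type; only FirstName and IsRemoved appear in the question.
public class Person
{
    public string FirstName { get; set; } = "";
    public bool IsRemoved { get; set; }
}

public static class BarDemo
{
    // Same shape as Bar above: apply the selector, then execute in parallel and flatten.
    public static IEnumerable<TResult> Bar<TSource, TResult>(
        IEnumerable<IQueryable<TSource>> queries,
        Func<IQueryable<TSource>, IQueryable<TResult>> selector)
    {
        return queries.Select(selector)
            .AsParallel()
            .Select(query => query.ToList())
            .SelectMany(results => results);
    }

    // Hypothetical helper that builds sample data and runs the shared query.
    public static List<string> Run()
    {
        var queries = new List<IQueryable<Person>>
        {
            new List<Person>
            {
                new Person { FirstName = "dana" },
                new Person { FirstName = "dave", IsRemoved = true },
                new Person { FirstName = "bob" },
            }.AsQueryable(),
            new List<Person> { new Person { FirstName = "dirk" } }.AsQueryable(),
        };

        // The reusable query, defined once and passed as a parameter.
        Func<IQueryable<Person>, IQueryable<string>> query = source => source
            .Where(x => x.FirstName.StartsWith("d") && !x.IsRemoved)
            .Select(x => x.FirstName)
            .OrderBy(name => name);

        return Bar(queries, query).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(Run().Count); // 2
    }
}
```

Note that the OrderBy inside the query only sorts each individual result list. The combined sequence has no guaranteed order, so sort it again afterwards if the overall order matters.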

Other tips

First - given your current implementation (in-memory lists wrapped with AsQueryable), there is no reason to use IQueryable<T> - you could just use IEnumerable<T>.

You could then write a method which takes an IEnumerable<IEnumerable<T>> and a Func<IEnumerable<T>, IEnumerable<U>>, to build a result:

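IEnumerable<IEnumerable<U>> QueryMultiple<T,U>(IEnumerable<IEnumerable<T>> inputs, Func<IEnumerable<T>,IEnumerable<U>> mapping)
{
     // ToList forces each mapped query to run inside the parallel Select;
     // without it, mapping(i) only builds a lazy sequence and no work is parallelized.
     return inputs.AsParallel().Select(i => (IEnumerable<U>)mapping(i).ToList());
}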

You could then use this as:

void Run()
{
    IEnumerable<IEnumerable<YourType>> inputs = GetYourObjects();

    // The query projects to FirstName, so the result element type is string.
    Func<IEnumerable<YourType>, IEnumerable<string>> query = i =>
       i.Where(x => x.FirstName.StartsWith("d") && !x.IsRemoved)
        .Select(x => x.FirstName)
        .OrderBy(name => name);

    var results = QueryMultiple(inputs, query);
}
License: CC-BY-SA with attribution
Not affiliated with StackOverflow