Question

I need to do some filtering on an ObjectSet to obtain the entities I need by doing this :

query = this.ObjectSet.Where(x => x.TypeId == 3); // this is just an example;

Later in the code (and before launching the deferred execution) I filter the query again like this :

query = query.Where(<another lambda here ...>);

That works quite well so far.

Here is my problem :

The entities contains a DateFrom property and a DateTo property, which are both DataTime types. They represent a period of time.

I need to filter the entities to get only those that are part of a collection of periods of time. The periods in the collection are not necessarily contiguous, so, the logic to retreive the entities looks like that :

entities.Where(x => x.DateFrom >= Period1.DateFrom and x.DateTo <= Period1.DateTo)
||
entities.Where(x => x.DateFrom >= Period2.DateFrom and x.DateTo <= Period2.DateTo)
||

... and on and on for all the periods in the collection.

I have tried doing that :

foreach (var ratePeriod in ratePeriods)
{
    var period = ratePeriod;

    query = query.Where(de =>
        de.Date >= period.DateFrom && de.Date <= period.DateTo);
}

But once I launch the deferred execution, it translates this into SQL just like I want it (one filter for each of the periods of time for as many periods there is in the collection), BUT, it translates to AND comparisons instead of OR comparisons, which returns no entities at all, since an entity cannot be part of more than one period of time, obviously.

I need to build some sort of dynamic linq here to aggregate the period filters.


Update

Based on hatten's answer, I've added the following member :

private Expression<Func<T, bool>> CombineWithOr<T>(Expression<Func<T, bool>> firstExpression, Expression<Func<T, bool>> secondExpression)
{
    // Create a parameter to use for both of the expression bodies.
    var parameter = Expression.Parameter(typeof(T), "x");
    // Invoke each expression with the new parameter, and combine the expression bodies with OR.
    var resultBody = Expression.Or(Expression.Invoke(firstExpression, parameter), Expression.Invoke(secondExpression, parameter));
    // Combine the parameter with the resulting expression body to create a new lambda expression.
    return Expression.Lambda<Func<T, bool>>(resultBody, parameter);
}

Declared a new CombineWithOr Expression :

Expression<Func<DocumentEntry, bool>> resultExpression = n => false;

And used it in my period collection iteration like this :

foreach (var ratePeriod in ratePeriods)
{
    var period = ratePeriod;
    Expression<Func<DocumentEntry, bool>> expression = de => de.Date >= period.DateFrom && de.Date <= period.DateTo;
    resultExpression = this.CombineWithOr(resultExpression, expression);
}

var documentEntries = query.Where(resultExpression.Compile()).ToList();

I looked at the resulting SQL and it's like the Expression has no effect at all. The resulting SQL returns the previously programmed filters but not the combined filters. Why ?


Update 2

I wanted to give feO2x's suggestion a try, so I have rewritten my filter query like this :

query = query.AsEnumerable()
    .Where(de => ratePeriods
        .Any(rp => rp.DateFrom <= de.Date && rp.DateTo >= de.Date))

As you can see, I added AsEnumerable() but the compiler gave me an error that it cannot convert the IEnumerable back to IQueryable, so I have added ToQueryable() at the end of my query :

query = query.AsEnumerable()
    .Where(de => ratePeriods
        .Any(rp => rp.DateFrom <= de.Date && rp.DateTo >= de.Date))
            .ToQueryable();

Everything works fine. I can compile the code and launch this query. However, it doesn't fit my needs.

While profiling the resulting SQL, I can see that the filtering is not part of the SQL query because it filters the dates in-memory during the process. I guess that you already know about that and that is what you intended to suggest.

Your suggestion works, BUT, since it fetches all the entities from the database (and there are thousands and thousands of them) before filtering them in-memory, it's really slow to get back that huge amount from the database.

What I really want is to send the period filtering as part of the resulting SQL query, so it won't return a huge amount of entities before finishing up with the filtering process.

Was it helpful?

Solution

Despite the good suggestions, I had to go with the LinqKit one. One of the reasons is that I will have to repeat the same kind of predicate aggregation in many other places in the code. Using LinqKit is the easiest one, not to mention I can get it done by writing only a few lines of code.

Here is how I solved my problem using LinqKit :

var predicate = PredicateBuilder.False<Document>();
foreach (var submittedPeriod in submittedPeriods)
{
    var period = period;
    predicate = predicate.Or(d =>
        d.Date >= period.DateFrom && d.Date <= period.DateTo);
}

And I launch deferred execution (note that I call AsExpandable() just before) :

var documents = this.ObjectSet.AsExpandable().Where(predicate).ToList();

I looked at the resulting SQL and it does a good job at translating my predicates into SQL.

OTHER TIPS

You can use a method like the following:

Expression<Func<T, bool>> CombineWithOr<T>(Expression<Func<T, bool>> firstExpression, Expression<Func<T, bool>> secondExpression)
{
    // Create a parameter to use for both of the expression bodies.
    var parameter = Expression.Parameter(typeof(T), "x");
    // Invoke each expression with the new parameter, and combine the expression bodies with OR.
    var resultBody = Expression.Or(Expression.Invoke(firstExpression, parameter), Expression.Invoke(secondExpression, parameter));
    // Combine the parameter with the resulting expression body to create a new lambda expression.
    return Expression.Lambda<Func<T, bool>>(resultBody, parameter);
}

And then:

Expression<Func<T, bool>> resultExpression = n => false; // Always false, so that it won't affect the OR.
foreach (var ratePeriod in ratePeriods)
{
    var period = ratePeriod;
    Expression<Func<T, bool>> expression = (de => de.Date >= period.DateFrom && de.Date <= period.DateTo);
    resultExpression = CombineWithOr(resultExpression, expression);
}

// Don't forget to compile the expression in the end.
query = query.Where(resultExpression.Compile());

For more information, you may want to check out the following:

Combining two expressions (Expression<Func<T, bool>>)

http://www.albahari.com/nutshell/predicatebuilder.aspx

Edit: The line Expression<Func<DocumentEntry, bool>> resultExpression = n => false; is just a placeholder. CombineWithOr method needs two methods to combine, if you write Expression<Func<DocumentEntry, bool>> resultExpression;', you can't use it in the call toCombineWithOrfor the first time in yourforeach` loop. It's just like the following code:

int resultOfMultiplications = 1;
for (int i = 0; i < 10; i++)
    resultOfMultiplications = resultOfMultiplications * i;

If there's nothing in resultOfMultiplications to begin with, you cannot use it in your loop.

As to why the lambda is n => false. Because it doesn't have any effect in an OR statement. For example, false OR someExpression OR someExpression is equal to someExpression OR someExpression. That false doesn't have any effect.

How about this code:

var targets = query.Where(de => 
    ratePeriods.Any(period => 
        de.Date >= period.DateFrom && de.Date <= period.DateTo));

I use the LINQ Any operator to determine if there is any rate period that conforms to de.Date. Although I'm not quite sure how this is translated into efficient SQL statements by entity. If you could post the resulting SQL, that would be quite interesting for me.

Hope this helps.

UPDATE after hattenn's answer:

I don't think that hattenn's solution would work, because Entity Framework uses LINQ expressions to produce the SQL or DML that is executed against the database. Therefore, Entity Framework relies on the IQueryable<T> interface rather than IEnumerable<T>. Now the default LINQ operators (like Where, Any, OrderBy, FirstOrDefault and so on) are implemented on both interfaces, thus the difference is sometimes hard to see. The main difference of these interfaces is that in case of the IEnumerable<T> extension methods, the returned enumerables are continuously updated without side effects, while in the case of IQueryable<T> the actual expression is recomposed, which is not free of side effects (i.e. you are altering the expression tree that is finally used to create the SQL query).

Now Entity Framework supports the ca. 50 standard query operators of LINQ, but if you write your own methods that manipulate an IQueryable<T> (like hatenn's method), this would result in an expression tree that Entity Framework might not be able to parse because it simply doesn't know the new extension method. This might be the cause why you cannot see the combined filters after you composed them (although I would expect an exception).

When does the solution with the Any operator work:

In the comments, you told that you encountered a System.NotSupportedException: Unable to create a constant value of type 'RatePeriod'. Only primitive types or enumeration types are supported in this context. This is the case when the RatePeriod objects are in-memory objects and not tracked by the Entity Framework ObjectContext or DbContext. I made a small test solution that can be downloaded from here: https://dl.dropboxusercontent.com/u/14810011/LinqToEntitiesOrOperator.zip

I used Visual Studio 2012 with LocalDB and Entity Framework 5. To see the results, open the class LinqToEntitiesOrOperatorTest, then open Test Explorer, build the solution and run all tests. You will recognize that ComplexOrOperatorTestWithInMemoryObjects will fail, all the others should pass.

The context I used looks like this:

public class DatabaseContext : DbContext
{
    public DbSet<Post> Posts { get; set; }
    public DbSet<RatePeriod> RatePeriods { get; set; }
}
public class Post
{
    public int ID { get; set; }
    public DateTime PostDate { get; set; }
}
public class RatePeriod
{
    public int ID { get; set; }
    public DateTime From { get; set; }
    public DateTime To { get; set; }
}

Well, it is as simple as it gets :-). In the test project, there are two important unit test methods:

    [TestMethod]
    public void ComplexOrOperatorDBTest()
    {
        var allAffectedPosts =
            DatabaseContext.Posts.Where(
                post =>
                DatabaseContext.RatePeriods.Any(period => period.From < post.PostDate && period.To > post.PostDate));

        Assert.AreEqual(3, allAffectedPosts.Count());
    }

    [TestMethod]
    public void ComplexOrOperatorTestWithInMemoryObjects()
    {
        var inMemoryRatePeriods = new List<RatePeriod>
            {
                new RatePeriod {ID = 1000, From = new DateTime(2002, 01, 01), To = new DateTime(2006, 01, 01)},
                new RatePeriod {ID = 1001, From = new DateTime(1963, 01, 01), To = new DateTime(1967, 01, 01)}
            };

        var allAffectedPosts =
            DatabaseContext.Posts.Where(
                post => inMemoryRatePeriods.Any(period => period.From < post.PostDate && period.To > post.PostDate));
        Assert.AreEqual(3, allAffectedPosts.Count());
    }

Notice that the first method passes while the second one fails with the exception mentioned above, although both methods do exactly the same thing, except that in the second case I created rate period objects in memory the DatabaseContext does not know about.

What can you do to solve this problem?

  1. Do your RatePeriod objects reside in the same ObjectContext or DbContext, respectively? Then use them right from it like I did in the first unit test mentioned above.

  2. If not, can you load all your posts at once or would this result in an OutOfMemoryException? If not, you could use the following code. Notice the AsEnumerable() call that results in the Where operator being used against the IEnumerable<T> interface instead of IQueryable<T>. Effectively, this results in all posts being loaded into memory and then filtered:

    [TestMethod]
    public void CorrectComplexOrOperatorTestWithInMemoryObjects()
    {
        var inMemoryRatePeriods = new List<RatePeriod>
            {
                new RatePeriod {ID = 1000, From = new DateTime(2002, 01, 01), To = new DateTime(2006, 01, 01)},
                new RatePeriod {ID = 1001, From = new DateTime(1963, 01, 01), To = new DateTime(1967, 01, 01)}
            };
    
        var allAffectedPosts =
            DatabaseContext.Posts.AsEnumerable()
                           .Where(
                               post =>
                               inMemoryRatePeriods.Any(
                                   period => period.From < post.PostDate && period.To > post.PostDate));
        Assert.AreEqual(3, allAffectedPosts.Count());
    }
    
  3. If the second solution is not possible, then I would recommend to write a TSQL stored procedure where you pass in your rate periods and that forms the correct SQL statement. This solution is also the most performant one.

Anyways, I think dynamic LINQ query creation was not as simple as I thought. Try using Entity SQL, similar to the way below:

var filters = new List<string>();
foreach (var ratePeriod in ratePeriods)
{
    filters.Add(string.Format("(it.Date >= {0} AND it.Date <= {1})", ratePeriod.DateFrom, ratePeriod.DateTo));
}

var filter = string.Join(" OR ", filters);
var result = query.Where(filter);

This may not be exactly correct (I haven't tried it), but it should be something similar to this.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top