Question

I'm trying to deepen my education about IQueryable custom providers and expression trees. I'm interested in custom parsing a cross-join (viz. SelectMany), and I'm trying to understand if that is exactly what EF is doing when it handles this:

var infoQuery =
    from cust in db.Customers
    from ord in cust.Orders
    where cust.City == "London"
    select ord;

Allegedly EF can handle cross joins, though the syntax in that link does not look right to me. Then I found a link with the title "Cross Product Queries" for EF. The syntax looks "correct," but the article itself speaks as if these are normal inner joins rather than cross joins.

Indeed, the code snippet above comes from that last article - and leaves me wondering if EF simply says "I know how these two entities are related, so I will form the inner-join automatically."

What is the real story with EF and this alleged "cross join" sample?

footnote

As I try to build my own IQueryable LINQ provider, the educational goal I've set for myself is to make my own query context for the code snippet above, so that when ToList() is called on the query:

  1. A Console.WriteLine() is automatically fired that prints "This is a cross join of:Customer and Order
  2. The == operator is magically converted into a != before the query is fully interpreted (by an ExpressionVisitor perhaps, not sure).

If someone knows of articles or has snippets of code that would speed my educational goal, please do share! :)

Was it helpful?

Solution

Look closely at the syntax:

from cust in db.Customers
from ord in cust.Orders       // cust.
select ...

Because of cust.Orders this is a regular inner join. It's even the preferred way to do a join, because it is far more succinct than the regular join statement.

I don't understand the title of this "Cross-Product Queries" article. Firstly, because as far as I know "cross product" applies to three-dimensional vectors not relational algebra. Secondly, because there is not a single cross join in the examples, only inner joins. Maybe what they're trying to say is that the above syntax looks like a cross join? But it isn't, so it's only confusing to use the word so prominently in the title.

This code snippet

from cust in db.Customers
from ord in db.Orders         // db.
select ...

is a true cross join (or Cartesian product). If there are n customers and m orders the result set contains n * m rows. Hardly useful with orders and customers, but it can be useful to get all combinations of elements in two sequences. This construct can also be useful if you want to join but also need a second condition in the join, like

from cust in db.Customers
from ord in db.Orders
where cust.CustomerId == ord.CustomerId && ord.OrderDate > DateTime.Today

Which effectively turns it into an inner join. Maybe not the best example, but there are cases where this comes in handy. It is not supported in the join - on - equals syntax.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top