Question

I have a query:

var contactInfos = from person in persons

                    join tempDesiredCar in desiredCars on person.contact_id equals tempDesiredCar.groupEntity_id
                                   into tempDesiredCars
                    from desiredCar in tempDesiredCars.DefaultIfEmpty()
                    select new {name = person.name, car = desiredCar.name};

This code translates to SQL:

SELECT [t1].[name], [t19].[name] AS [car]
FROM [dbo].[Person] AS [t1]
CROSS APPLY ((
        SELECT NULL AS [EMPTY]
        ) AS [t6]
    OUTER APPLY (
        SELECT [t18].[name]
        FROM (
            SELECT [t17].[contact_id], [t17].[name]
            FROM (
                SELECT [t7].[contact_id], [t11].[name]
                FROM [dbo].[DesiredCar] AS [t7]
                INNER JOIN (
                    SELECT MAX([t9].[value]) AS [value]
                    FROM (
                        SELECT [t8].[id] AS [value], [t8].[contact_id]
                        FROM [dbo].[DesiredCar] AS [t8]
                        ) AS [t9]
                    GROUP BY [t9].[contact_id]
                    ) AS [t10] ON ([t7].[id]) = [t10].[value]
                LEFT OUTER JOIN [dbo].[FamilyModel] AS [t11] ON [t11].[id] = [t7].[model]
                WHERE [t7].[model] IS NOT NULL
                UNION
                SELECT [t12].[contact_id], [t16].[name]
                FROM [dbo].[DesiredCar] AS [t12]
                INNER JOIN (
                    SELECT MAX([t14].[value]) AS [value]
                    FROM (
                        SELECT [t13].[id] AS [value], [t13].[contact_id]
                        FROM [dbo].[DesiredCar] AS [t13]
                        ) AS [t14]
                    GROUP BY [t14].[contact_id]
                    ) AS [t15] ON ([t12].[id]) = [t15].[value]
                LEFT OUTER JOIN [dbo].[CarBrand] AS [t16] ON [t16].[id] = [t12].[carBrand_id]
                WHERE [t12].[carBrand_id] IS NOT NULL
                ) AS [t17]
            ) AS [t18]
        where [t1].[contact_id] = [t18].[contact_id]
        ) AS [t19])

That code uses Apply. If replace Apply by Left Join code runs faster in a few times. How to force Linq to generate code using the Left Join?

Was it helpful?

Solution

I’ve found the reason why LINQ is using an apply operator. Operands are checking, whether is there dependence from foreign key in that join or there is not. If it's just a table – linq uses "Left Join". LINQ uses "Outer apply" if the subquery contains Group By, Order By with column used in “equals”. Use simple subqueries in LINQ JOIN.

OTHER TIPS

I cannot find a reason why a) join and apply produce different plans and b) LINQ is using an apply at all. This is a strange case.

I tried coming up with workarounds, hope they work:

//project only the needed columns
var contactInfos = from person in persons
                    join tempDesiredCar in
                         desiredCars.Select(x => new { x.groupEntity_id, x.name })
                         on person.contact_id equals tempDesiredCar.groupEntity_id
                         into tempDesiredCars
                    from desiredCar in tempDesiredCars.DefaultIfEmpty()
                    select new { name = person.name, car = desiredCar.name };

//change join to from
var contactInfos = from person in persons
                   from desiredCar in
                         desiredCars.Select(x => new { x.groupEntity_id, x.name })
                                    .Where(x => person.contact_id = x.groupEntity_id)
                                    .DefaultIfEmpty()
                    select new { name = person.name, car = desiredCar.name };

These are just speculative rewritings. I hope they work by coincidence... I know this is not a good way to solve a problem but in this strange case it's the best I can offer.

I inspected the query plans.

The apply plan uses Nested Loops to join the tables together. For each record in the Person table, it reads every record in the DesiredCars table. Do you really need to find the desired car of every person in the database? If not, consider filtering the Person table down a bit first.

The join plan uses Hashmatching to join the tables together. It reads both tables once, producing a map of one table, and using that map on the second table. The optimizer has pointed out a couple of index recommendations. These recommendations may help both plans.

        <MissingIndexes>
          <MissingIndexGroup Impact="12.6549">
            <MissingIndex Database="[AudiCRM]" Schema="[dbo]" Table="[DesiredCar]">
              <ColumnGroup Usage="INEQUALITY">
                <Column Name="[carBrand_id]" ColumnId="13" />
              </ColumnGroup>
              <ColumnGroup Usage="INCLUDE">
                <Column Name="[id]" ColumnId="1" />
                <Column Name="[contact_id]" ColumnId="2" />
              </ColumnGroup>
            </MissingIndex>
          </MissingIndexGroup>
          <MissingIndexGroup Impact="12.6549">
            <MissingIndex Database="[AudiCRM]" Schema="[dbo]" Table="[DesiredCar]">
              <ColumnGroup Usage="INEQUALITY">
                <Column Name="[model]" ColumnId="3" />
              </ColumnGroup>
              <ColumnGroup Usage="INCLUDE">
                <Column Name="[id]" ColumnId="1" />
                <Column Name="[contact_id]" ColumnId="2" />
              </ColumnGroup>
            </MissingIndex>
          </MissingIndexGroup>
        </MissingIndexes>
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top