Why does the optimizer choose nested loops over merge joins here?

https://dba.stackexchange.com/questions/188448

09-10-2020
|

Question

I have 3 tables. #a is main one and two secondary tables, #b and #c.

create table #a (a int not null, primary key (a asc)) ;
create table #b (b int not null, primary key (b asc)) ;
create table #c (c int not null, primary key (c asc)) ;

insert into #a (a)
select x*10 + y
from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))x(x)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))y(y) ;

insert into #b (b)
select a from #a where a % 5 > 0 ;

insert into #c (c)
select a from #a where a % 4 > 0 ;

If I join main table #a with only one secondary table, there will be Merge Join in query plan.

select *
from #a a
inner join #b b on a = b ;

But if I join main table #a with both secondary tables, there will be only Nested Loops.

select *
from #a a
inner join #b b on a = b
inner join #c c on a = c ;

Why does it work like that and what should I do to get two Merge Joins?

Without inner merge join hint.

Solution

Why does it work like that and what should I do to get two Merge Joins?

With three table references (the minimum required), the query qualifies for the Transaction Processing (aka search 0) stage of cost-based optimization.

This stage is aimed at OLTP queries, which normally benefit from a navigational (index-based) strategy. Nested Loops Join is the main physical join type available (hash and merge are only considered if no valid nested loops plan can be found in this stage).

If this stage finds a low cost (good enough) plan, cost-based optimization stops there. This prevents spending more time in optimization that we can expect to save over the best solution found so far. If the cost exceeds a threshold, the optimizer moves on to the Quick Plan (search 1), parallel Quick Plan, and Full Optimization (search 2) phases.

The query with two table references did not qualify for Transaction Processing, and went straight into Quick Plan, where Merge and Hash joins are available.

See my Query Optimizer Deep Dive series for more information.

Without inner merge join hint.

If you absolutely must hint a physical join type, strongly prefer OPTION (MERGE JOIN). This allows the optimizer to still consider changing the join order.

Join hints like INNER MERGE JOIN come with an implied OPTION (FORCE ORDER), which severely limits the optimizer's freedom, with consequences most people (including experts) do not appreciate.

OTHER TIPS

I think the reason is because you have not enough data in your tables, so SQL Server does not chooses the BEST plan, it chooses Good Enough plan to execute your query.

I try following

create table #a (a int not null, primary key (a asc))
create table #b (b int not null, primary key (b asc))
create table #c (c int not null, primary key (c asc))

insert into #a (a)
select h+10*g
from  
(values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))g(g)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))h(h)


insert into #b (b)
select a from #a where a % 5 > 0

insert into #c (c)
select a from #a where a % 4 > 0

Now if I run

select *
from #a a
inner join #b b on a = b
inner join #c c on a = c

Here is the execution plan

If I use merge join hint I get the following execution plan.

select *
from #c c
inner merge join #a a on a = c
inner merge join #b b on a = b

So it is clear that merge join will be better in this case. Estimated subTree cost is less for the query with merge joins, but both this queries are fast enough, there both do only 12 logical reads, so SQL Server decides that NESTED LOOP join is also good solution.

Not let add more data to our tables.

create table #a (a int not null, primary key (a asc))
create table #b (b int not null, primary key (b asc))
create table #c (c int not null, primary key (c asc))

insert into #a (a)
select h+10*g + 100*f + 1000*e+ 10000*z + 100000*y + 1000000*x
from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))x(x)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))y(y)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))z(z)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))e(e)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))f(f)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))g(g)
cross join (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9))h(h)


insert into #b (b)
select a from #a where a % 5 > 0

insert into #c (c)
select a from #a where a % 4 > 0

Now we have 10mln data in #a, 8mln in #b and 6mln in #c. If I run the original query without any hints I get merge joins as expected.

select *
from #a a
inner join #b b on a = b
inner join #c c on a = c

If one join input is small (fewer than 10 rows) and the other join input is fairly large and indexed on its join columns, an index nested loops join is the fastest join operation because they require the least I/O and the fewest comparisons.

If the two join inputs are not small but are sorted on their join column (for example, if they were obtained by scanning sorted indexes), a merge join is the fastest join operation.

Hash joins can efficiently process large, unsorted, nonindexed inputs.

Advanced Query Tuning Concepts

LOOP, HASH and MERGE Join Types

Understanding SQL Server Physical Joins

Optimizer chooses between merge/nested loop/hash join based on existing statistics, tables size, and presence of indexes. In general nested loop is preferable if of the inputs is much smaller than other, and they are both indexed on join column, merge will be better if size of two inputs are pretty equal and indexed. Populating tables involved with more values (couple hundred thousands rows ) will very likely incline optimizer to choose merge join algorithm. You may find more details on how it's implemented in SQLServer here.
You may also try hints (say FORCE ORDER ), but even if it works one time, it's not guaranteed it'll always generate the same plan .

Also, keep in mind that optimizer's task is not to find the best plan, but to return the plan which is good enough in timely fashion.

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange