Why does this stored procedure cause a clustered index scan, but a seek when using OPTION (RECOMPILE)?
25-02-2021
Question
I think I may know the answer based on my research, but I am looking for confirmation of how and why the engine compiles the plan the way it does for the query below.
Parameters being passed in: @ID INT, @OtherID INT
SELECT b.Column1,
       b.Column2,
       b.Column3,
       b.Column4,
       b.Column5,
       c.Column1,
       b.Column1,
       e.Column1
FROM Table1 AS b
INNER JOIN Table2 AS t ON b.ID = t.ID
LEFT JOIN [LINKED SERVER].[DB].dbo.Table3 AS c ON b.ID = c.ID
LEFT JOIN Table4 AS e ON b.ID = e.ID
WHERE (b.ID = @ID OR @ID = 0)
  AND b.ID = @OtherID
  AND b.ID IS NOT NULL
  AND e.ID = 1
I have determined that the index scan is caused by this line: where (b.ID = @ID or @ID = 0). More specifically, it is caused by the @ID = 0 part. To clarify further, 0 does not exist as a value of that ID column in the underlying table. A developer added the check so that a user can pull back all of the results by passing 0 for the parameter, in which case many more rows are returned (typically, you would return just 1-3 results).
What is extremely odd is that if I add OPTION (RECOMPILE), the engine is able to create a much better plan, at the cost of compilation overhead on every execution, of course.
What I would like to know is how this is possible. From what I have read online, with OPTION (RECOMPILE) the engine literally replaces the parameter with the actual value passed in, so it can very easily see that an @ID of 1234 does not equal 0. Without OPTION (RECOMPILE), however, the engine takes the total number of records, 120,000, and divides it by the number of distinct values, 107,000. This comes out to about 1.1 estimated rows, which I confirmed by looking at the estimated properties of the plan that has the index scan. But why would the engine still choose an index scan if the estimate is correct? I even updated statistics just to be sure.
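For reference, the estimate described above can be reproduced with a one-line calculation. This is only the arithmetic from the two figures quoted in the question. The column alias is made up for illustration.

```sql
-- Figures from the question: 120,000 rows and 107,000 distinct ID values.
-- The decimal literals avoid integer division.
SELECT 120000.0 / 107000.0 AS EstimatedRowsPerValue;  -- roughly 1.12
```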
Solution
b.ID = @ID OR @ID = 0
The optimizer has to produce a plan with an index scan, because the plan is cached and reused.
On a subsequent execution, the parameter @ID might be zero. An index seek is of no value in that case, because there is no ID value to seek to. Other times, a non-zero value will be provided for @ID, but the cached plan has to work correctly for all possible parameter values.
When OPTION (RECOMPILE) is used, the Parameter Embedding Optimization (PEO) means the current value for @ID is used in place of the parameter on each execution, and no plan is cached.
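As a minimal sketch of where the hint goes, here is a cut-down version of the question's query. The joins are omitted for brevity, and the parameters are declared as local variables with made-up values so that the batch can run on its own. Both simplifications are assumptions for illustration. In the real procedure the hint would be appended to the full statement in the same way.

```sql
-- Sketch only: the joins from the question are omitted to keep it short.
-- In the stored procedure, @ID and @OtherID are parameters; they are declared
-- as local variables here (with made-up values) so the batch can run on its own.
DECLARE @ID INT = 1234,
        @OtherID INT = 1234;

SELECT b.Column1,
       b.Column2
FROM Table1 AS b
WHERE (b.ID = @ID OR @ID = 0)
  AND b.ID = @OtherID
OPTION (RECOMPILE);  -- the hint is the last clause of the statement
```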
Say @ID is 1234. After PEO, the optimizer sees:
b.ID = 1234 OR 1234 = 0
That is simplified by the contradiction detection logic to:
b.ID = 1234
...which enables a seek on ID.
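The same reasoning can be applied to the other case. As a sketch that follows the steps above by hand (it is not output from the optimizer), here is what the embedded predicate looks like when @ID is 0:

```sql
-- Predicate fragments, not a complete statement.
b.ID = 0 OR 0 = 0   -- after embedding @ID = 0
-- 0 = 0 is always true, so the OR is true for every row
-- and this predicate no longer restricts b.ID at all.
```

With nothing left to seek on from this predicate, reading every row is the expected outcome for that call. This is also why a single cached plan, which must cover both cases, cannot rely on a seek here.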
For further reading please see my article Parameter Sniffing, Embedding, and the RECOMPILE Options.
Other tips
Your problem is the optional condition (b.ID = @ID or @ID = 0). If you don't want to use OPTION (RECOMPILE), you should split your query on that condition:
IF @ID = 0 BEGIN
SELECT b.Column1,
b.Column2,
b.Column3,
b.Column4,
b.Column5,
c.Column1,
b.Column1,
e.Column1
FROM Table1 AS b
INNER JOIN Table2 AS t ON b.ID = t.ID
LEFT JOIN [LINKED SERVER].[DB].dbo.Table3 AS c ON b.ID = c.ID
LEFT JOIN Table4 AS e ON b.ID = e.ID
WHERE b.ID = @OtherID
AND b.ID IS NOT NULL
AND e.ID = 1;
END ELSE BEGIN
SELECT b.Column1,
b.Column2,
b.Column3,
b.Column4,
b.Column5,
c.Column1,
b.Column1,
e.Column1
FROM Table1 AS b
INNER JOIN Table2 AS t ON b.ID = t.ID
LEFT JOIN [LINKED SERVER].[DB].dbo.Table3 AS c ON b.ID = c.ID
LEFT JOIN Table4 AS e ON b.ID = e.ID
WHERE b.ID = @ID
AND b.ID = @OtherID
AND b.ID IS NOT NULL
AND e.ID = 1;
END