SQL Server - Can I surgically remove a bad cached query plan or am I chasing the wrong idea?

https://dba.stackexchange.com/questions/280998

11-03-2021

Question

Let me set the stage:

SQL Server > 2016
Several databases with the same schemas
Data is similar but not identical
The same indexes exist etc., all systems but one work as expected
The execution plan recommends creating an index with ALL of the columns as INCLUDE columns. An index on the same base column already exists, without the includes.

The problem:

On ONE of the databases, the query engine does not utilize an existing index and uses a full table scan
Manually executing the query (which uses a parameter) - full table scan!
Manually executing the query without a parameter (using a fixed date) - uses the index.
Manually executing the query with a parameter and adding WITH (INDEX(update_ts_INDEX)) - uses the index. The query goes from 2 minutes to less than a second.

Hypothesis:

I believe that the query cache might be corrupt - but I don't know if that is possible. By trade I am a Java developer who has spent the last 3 years digging into SQL Server performance tuning, etc.
The fact that manually executing the query without a parameter - the query engine picks the right index - blows my mind. Ideas?

Notes:

On the other servers, which run the same code and queries, I can find NO evidence of this same query taking any time. They must be using the index.
I just executed the command manually, in another region with a parameter, and it used the index so the issue is indeed in ONLY one region - one database.

Example of query:

DECLARE @P1 DATETIME = GETDATE() - .1;

SELECT value_1, update_ts, value_2 FROM PRODUCTION_TABLE 
WHERE (PRODUCTION_TABLE.update_ts > @P1);

If I remove the variable and handcode a date, it uses the index.

Question: Can I surgically remove a bad cached query plan for this particular query?

--- UPDATE ---

Using OPTION (RECOMPILE) causes the query engine to pick the correct index.

Solution

Your plan probably isn't corrupt, per se. Your data likely doesn't have an even distribution for that field, and the plan was likely created using an outlier value (or your code is overly complex, but I am guessing not, based upon your sample; if it is complex, start with a refactor). Query Store aside, plans get dropped when servers restart, when there is memory pressure, etc. Then they get recreated, generally based upon the parameters you supply, which may or may not be outliers.

You have a couple of options based upon circumstances.

If you have Query Store, use it. Drop the plan, re-run with a representative value, and pin (force) the resulting plan. As noted above, the problem with removing a plan geared towards an outlier is that there is a chance it can come back. Pinning a plan avoids that completely.
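As a rough sketch of the Query Store route, assuming Query Store is enabled on the affected database: the LIKE filter and the numeric IDs below are placeholders for illustration, not values taken from the question, and you would substitute the query_id and plan_id returned by the first statement.

```sql
-- Sketch only: assumes Query Store is enabled on this database.

-- 1. Find the query and the plans Query Store has captured for it.
--    The LIKE pattern is a hypothetical filter; adjust it to match your statement.
SELECT q.query_id,
       p.plan_id,
       p.is_forced_plan,
       p.last_execution_time,
       qt.query_sql_text
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
    ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan AS p
    ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE N'%PRODUCTION_TABLE%update_ts%';

-- 2. Remove the plan that was built for the outlier value
--    (101 is a placeholder plan_id).
EXEC sys.sp_query_store_remove_plan @plan_id = 101;

-- 3. Once a run with a representative value has produced a good plan,
--    force it (42 and 102 are placeholder IDs).
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 102;
```

If the forced plan later turns out to be a poor fit, sys.sp_query_store_unforce_plan takes the same two parameters and releases it.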

If this is an infrequently run bit of SQL (a couple of times an hour), it runs for a fair amount of time (over a few seconds, which it seems to), and the code isn't too complex, just have it create a new plan each time with OPTION (RECOMPILE); the potential savings are worth the compile cost.
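Applied to the sample query from the question, the recompile hint would look roughly like this (table and column names are the ones from the question):

```sql
DECLARE @P1 DATETIME = GETDATE() - .1;

SELECT value_1, update_ts, value_2
FROM PRODUCTION_TABLE
WHERE PRODUCTION_TABLE.update_ts > @P1
OPTION (RECOMPILE);  -- compile a fresh plan on every execution, using the current value of @P1
```

The trade-off is a compilation on every run, which is why this fits statements that run rarely and take long enough for the compile time not to matter.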

If, however, you have a generally representative value for your parameter, you can add the OPTION (OPTIMIZE FOR (@var = '2020-01-01')) query hint. This means that when a new plan is generated, the optimizer won't use the supplied value to determine the plan, but the value you specified in the hint. Keep in mind that when you do hit an outlier value (e.g. NULLs in a sometimes-NULL field), you will not get the ideal plan. This is the route I've gone most often in similar circumstances, and in your case I would simply hard-code a recent date (if that makes sense).
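A minimal sketch of that hint on the question's query follows. The date literal is only an example of a "representative" value; pick one that reflects how the query is normally called.

```sql
DECLARE @P1 DATETIME = GETDATE() - .1;

SELECT value_1, update_ts, value_2
FROM PRODUCTION_TABLE
WHERE PRODUCTION_TABLE.update_ts > @P1
-- Build the plan as if @P1 were this date, whatever value is actually passed at run time.
OPTION (OPTIMIZE FOR (@P1 = '2020-01-01'));
```

Because the literal is fixed in the query text, a hard-coded date drifts further from "recent" over time and needs to be revisited occasionally.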

There are other things you can do, but in my experience the value of those falls off dramatically and quickly.

OTHER TIPS

You can!

Run this to find your plan's handle (the plan_handle column):

SELECT qs.plan_handle, qs.creation_time, qs.last_execution_time, qs.execution_count, qt.text
FROM
   sys.dm_exec_query_stats qs
   CROSS APPLY sys.dm_exec_sql_text (qs.[sql_handle]) AS qt

Then pass that handle to the DBCC FREEPROCCACHE command:

DBCC FREEPROCCACHE (plan_handle_id_goes_here)
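The two steps can also be combined so the handle does not have to be copied by hand. This is a sketch under assumptions: the LIKE pattern is a hypothetical filter for the question's query, and TOP (1) simply takes the most recently executed match, so check what the lookup returns before evicting anything on a production server.

```sql
DECLARE @plan_handle VARBINARY(64);

-- Look up the cached plan for the problem statement.
SELECT TOP (1) @plan_handle = qs.plan_handle
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS qt
WHERE qt.text LIKE N'%PRODUCTION_TABLE%update_ts%'   -- placeholder pattern
  AND qt.text NOT LIKE N'%dm_exec_query_stats%'      -- skip this lookup query itself
ORDER BY qs.last_execution_time DESC;

-- Evict only that plan; everything else in the cache is left alone.
IF @plan_handle IS NOT NULL
    DBCC FREEPROCCACHE (@plan_handle);
```

The next execution of the statement compiles a new plan, so run it first with a representative value if you want that value to shape the plan.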

You could always just free the whole thing, though. It'll add a few seconds to the first time each query is run, but no more than what a server restart (or an IISRESET on a web server) would do anyway.

This is how you clear it all:

DBCC FREESYSTEMCACHE ('ALL') WITH MARK_IN_USE_FOR_REMOVAL;

Check here for more info https://serverfault.com/questions/91285/how-do-i-remove-a-specific-bad-plan-from-the-sql-server-query-cache

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange