“Persisted” column is not used
-
17-03-2021 - |
Вопрос
I have a table with the following structure:
ItemID int
ItemType char(1)
Language char(2)
Localization char(2)
Literal nvarchar(4000)
which is used to translate the text of specific ItemID
to Literal
one. The Literal
column may contain HTML
tags. For a particular set of queries, these HTML
tags needs to be removed and because the sanitization is done with SQL CLR over thousands of rows, I prefer not to perform such on read.
So, I have added a persisted column like this:
ALTER TABLE [dbo].[table]
ADD [LiteralSanitized] AS NULLIF(CAST(LTRIM(RTRIM([dbo].[fn_Utils_RemoveAllHtmlTags] ([Literal]))) AS NVARCHAR(4000)), '') PERSISTED;
The table has only one index (the primary key) with the following definition:
ItemID, ItemType, Language, Localization
So the query was faster, but I am seeing some additional reads for this table:
Scan count - 2 vs 12,230
Logical reads - 3,234 vs 43,472
which might be normal as now I reading more data because of the column. So, I have added the following index:
(ItemID ASC, ItemType ASC, Language ASC, Localization ASC) INCLUDE ([LiteralSanitized])
but it is not used by the engine. So, I try to force the engine to use it:
UPDATE #temp
SET [QuestionText] = PSGQ.[LiteralSanitized]
FROM #temp PQD
INNER JOIN [dbo].[table_with_translations] PSGQ WITH(INDEX = [the_new_index])
ON PQD.[ProtoQuestionID] = PSGQ.[ItemID]
WHERE PSGQ.[ItemType] = 'Q'
AND PSGQ.[Language] = @language
AND RTRIM(PSGQ.[Localization]) = ''
AND PSGQ.[LiteralSanitized] IS NOT NULL;
but the engine is doing the following:
- performs index scan (on my new index)
- then performs nested loop and key lookup with the clustered index
- extract the
literal
column
If my column is Persisted
, why the engine continues to try to return the Literal
column as it is even not needed?
Решение
The most likely answer to your question lies in this blog post from Paul White: Properly Persisted Computed Columns
The lack of optimizer cost model support means SQL Server assigns a small fixed cost to scalar computations, regardless of complexity or implementation. As a consequence, the server often decides to recompute a stored computed column value instead of reading the persisted or indexed value directly. This is particularly painful when the computed expression is expensive, for example when it involves calling a scalar user-defined function.
It sounds very much like SQL Server has decided that it would rather recompute the column value rather than read it from disk.
One solution, as mentioned by Paul, is to use trace flag 176 to disable computed column expansion.