Question

I have a table with the following structure:

ItemID           int
ItemType         char(1)
Language         char(2)
Localization     char(2)
Literal          nvarchar(4000)

which is used to translate the text of specific ItemID to Literal one. The Literal column may contain HTML tags. For a particular set of queries, these HTML tags needs to be removed and because the sanitization is done with SQL CLR over thousands of rows, I prefer not to perform such on read.

So, I have added a persisted column like this:

ALTER TABLE [dbo].[table]
ADD [LiteralSanitized] AS NULLIF(CAST(LTRIM(RTRIM([dbo].[fn_Utils_RemoveAllHtmlTags] ([Literal]))) AS NVARCHAR(4000)), '') PERSISTED;

The table has only one index (the primary key) with the following definition:

ItemID, ItemType, Language, Localization

So the query was faster, but I am seeing some additional reads for this table:

Scan count    - 2     vs 12,230 
Logical reads - 3,234 vs 43,472 

which might be normal as now I reading more data because of the column. So, I have added the following index:

(ItemID ASC, ItemType ASC, Language ASC, Localization ASC) INCLUDE ([LiteralSanitized])

but it is not used by the engine. So, I try to force the engine to use it:

UPDATE #temp
SET [QuestionText] = PSGQ.[LiteralSanitized]
FROM #temp PQD                                            
INNER JOIN [dbo].[table_with_translations] PSGQ WITH(INDEX = [the_new_index])
    ON PQD.[ProtoQuestionID] = PSGQ.[ItemID] 
WHERE PSGQ.[ItemType] = 'Q' 
    AND PSGQ.[Language] = @language 
    AND RTRIM(PSGQ.[Localization]) = ''
    AND PSGQ.[LiteralSanitized] IS NOT NULL;

but the engine is doing the following:

  • performs index scan (on my new index)
  • then performs nested loop and key lookup with the clustered index
  • extract the literal column

If my column is Persisted, why the engine continues to try to return the Literal column as it is even not needed?

enter image description here

Was it helpful?

Solution

The most likely answer to your question lies in this blog post from Paul White: Properly Persisted Computed Columns

The lack of optimizer cost model support means SQL Server assigns a small fixed cost to scalar computations, regardless of complexity or implementation. As a consequence, the server often decides to recompute a stored computed column value instead of reading the persisted or indexed value directly. This is particularly painful when the computed expression is expensive, for example when it involves calling a scalar user-defined function.

It sounds very much like SQL Server has decided that it would rather recompute the column value rather than read it from disk.

One solution, as mentioned by Paul, is to use trace flag 176 to disable computed column expansion.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top