Performance differences between RID Lookup vs Key Lookup?

https://dba.stackexchange.com/questions/217199

09-01-2021
|

Pergunta

Are there any performance differences between when a non-clustered index uses the clustered index's key to locate the row vs when that table doesn't have a clustered index and the non-clustered index locates the row via the RID?

Does different levels of fragmentation impact this performance comparison as well? (E.g. in both scenarios the tables are 0% fragmented, vs 50%, vs 100%.)

Solução

Leaving aside the fragmentation bogeyeman (it doesn't really matter when doing singleton lookups), the main difference is that an RID specifies the exact page a row is on, while with a Key Lookup you traverse the non-leaf levels of the clustered index to find the target page. Aaron Bertrand did some tests on this in Is a RID Lookup faster than a Key Lookup?

However, Heaps can have forwarded fetches (or records) in them, in which case multiple logical IOs are required to find the target row.

I blogged about this recently, and I'm reproducing the content here to avoid a comment answer.

CREATE TABLE el_heapo 
( 
    id INT IDENTITY, 
    date_fudge DATE, 
    stuffing VARCHAR(3000)
);

INSERT dbo.el_heapo WITH (TABLOCKX) 
      ( date_fudge, stuffing )
SELECT DATEADD(HOUR, x.n, GETDATE()), REPLICATE('a', 1000)
FROM (
SELECT TOP (1000 * 1000)
ROW_NUMBER() OVER (ORDER BY @@SPID)
FROM sys.messages AS m
CROSS JOIN sys.messages AS m2
) AS x (n)

CREATE NONCLUSTERED INDEX ix_heapo ON dbo.el_heapo (date_fudge);

We can look at the table with sp_BlitzIndex

EXEC master.dbo.sp_BlitzIndex @DatabaseName = N'Crap', 
                              @SchemaName = 'dbo', 
                              @TableName = 'el_heapo';

This query will produce bookmark lookups.

SELECT *
FROM dbo.el_heapo AS eh
WHERE eh.date_fudge BETWEEN '2018-09-01' AND '2019-09-01'
AND 1 = (SELECT 1)
OPTION(MAXDOP 1);

Now we can cause some forwarded records to occur:

UPDATE eh
SET eh.stuffing = REPLICATE('z', 3000)
FROM dbo.el_heapo AS eh
WHERE eh.date_fudge BETWEEN '2018-09-01' AND '2019-09-01'
OPTION(MAXDOP 1)

BlitzIndex will show them to us:

And if we re-run the lookup query:

Profiler will show a difference as well:

Outras dicas

Remember that the traversing of the non-leaf levels of the clustered index (which is basically what differs a Key to a rid lookup) will almost all be in memory. SQL Server reads through the root, next level, etc again, again, again and again etc. In other words, they will be super-hot.

Compare that to forwarded records in a heap. You think you found one row, but dar, it is somewhere else. You, so to speak, jump all over the place. So one should be careful to just look at "reads" here. Logical vs physical reads becomes a factor.

And, of course, a very important aspect is if you have heap forwarded records in the first place! Sys.dm_db_index_physical_stats with DETAILED option will tell you.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a dba.stackexchange