Question

I have a pretty big temp table (~5 million rows) that I need to update a few fields on.

Here's an example update statement:

UPDATE T    
SET CarModel = 
    CASE 
        WHEN CarModel = '350z' AND ModelYear = 2008 AND CarMake NOT LIKE '%DATSUN%' THEN '370z Nismo' 
        ELSE CarModel
    END,
IsMidTierCar = 
    CASE 
        WHEN NumberOfCarsSold >= 500000 AND NumberOfCarsSold <= 1000000 THEN IsMidTierCar 
        ELSE NULL 
    END,
IsRareCarMakeModel = 
    CASE 
        WHEN CarModel = '350z' AND SpecialEditioName != '' AND ModelYear = 2008 THEN 1
        ELSE 0
    END
FROM #CarsTempTable AS T
WHERE IsRareCarMakeModel IS NULL

The catch is that the IsRareCarMakeModel field is NULL for every single row in the temp table before this update runs. So far, the fastest run I've seen is with no index on the temp table at all. Every index combination I've tried takes at least 5x as long (and usually I just end up killing it instead of waiting for it to finish).

Is no index going to be the best I'll ever get? Since the cardinality of the temp table's IsRareCarMakeModel field is extremely low, is a table scan always going to be the best course of action for the optimizer?

Solution

Ronaldo's comment got me thinking, and the more I think about it, the more I would answer my own question this way: if every row of a table needs to be read, then no ordering of the stored data can improve read performance. In other words, no index would make the update statement in my scenario any faster, because every row has to be read in order to be updated by that query. The query does have a filter, but the filter doesn't return any fewer rows than the whole table.
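
One quick way to confirm that the filter is not selective is to compare the table's total row count with the number of rows the WHERE clause keeps. The query below is only a sketch against the #CarsTempTable from the question; the column aliases TotalRows and RowsMatchingFilter are made-up names for illustration.

```sql
-- Count all rows, and count the rows that satisfy the UPDATE's filter
-- (IsRareCarMakeModel IS NULL). The aliases are illustrative only.
SELECT
    COUNT(*) AS TotalRows,
    SUM(CASE WHEN IsRareCarMakeModel IS NULL THEN 1 ELSE 0 END) AS RowsMatchingFilter
FROM #CarsTempTable;
```

If the two numbers are equal, as described in the question, the filter eliminates nothing and the update has to touch every row no matter how the data is indexed.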

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange