Indexing to reduce cost of SORT

https://stackoverflow.com/questions/21509280

06-10-2022

Question

I have this table:

TopScores

Username char(255)
Score int
DateAdded datetime2

which will have a lot of rows.

I run the following query (code for a stored procedure) against it to get the top 5 high scorers, and the score for a particular Username preceded by the person directly above them in position and the person below:

WITH Rankings
     AS (SELECT Row_Number() OVER (ORDER BY Score DESC, DateAdded DESC) AS Pos,
                --if score same, latest date higher
                Username,
                Score
         FROM   TopScores) 
SELECT TOP 5 Pos,
             Username,
             Score
FROM   Rankings
UNION ALL
SELECT Pos,
       Username,
       Score
FROM   Rankings
WHERE  Pos BETWEEN (SELECT Pos
                    FROM   Rankings
                    WHERE  Username = @User) - 1 AND (SELECT Pos
                                                      FROM   Rankings
                                                      WHERE  Username = @User) + 1

I had to index the table so I added clustered: ci_TopScores(Username) first and nonclustered: nci_TopScores(Dateadded, Score).

The query plan showed that the clustered index was completely ignored (I had tested before creating the nonclustered index, and the query used it then), and logical reads were higher than for a table scan without any index.

Sort was the highest costing operator. So I adjusted indexes to clustered: ci_TopScores(Score desc, Dateadded desc) and nonclustered: nci_TopScores(Username).

Still sort costs the same. Nonclustered: nci_TopScores(Username) is completely ignored again.

How can I avoid the high cost of sort and index this table effectively?

Solution

The CTE does not filter on Username, so it is no surprise that the Username index is not used.

A CTE is just syntax; it is not materialized. You reference Rankings four times, so it is evaluated four times.

Try a #temp table so the ranking is only evaluated once.
But you need to think about the indexes.
I would skip the Row_Number and just put an identity primary key on the #temp to serve as pos.
I would skip any other indexes on #temp.

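For TopScores, an index on (Score desc, DateAdded desc, Username asc) will help, because it already returns rows in the order the sort needs.
But it won't help if it is fragmented.
That is an index that will fragment when you insert, since new scores can land anywhere in the key order.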
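
As a rough sketch of the index described above, assuming SQL Server: the index name ix_TopScores_Score_DateAdded_Username is made up for illustration, and the decision to rebuild is a judgment call rather than a fixed rule. The second statement checks how fragmented the index has become.

```sql
-- Hypothetical index name; column order follows the suggestion above.
CREATE NONCLUSTERED INDEX ix_TopScores_Score_DateAdded_Username
    ON TopScores (Score DESC, DateAdded DESC, Username ASC);

-- Check fragmentation of that index (LIMITED is the cheapest scan mode).
SELECT i.name,
       ps.avg_fragmentation_in_percent,
       ps.page_count
FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'TopScores'),
                                      NULL, NULL, 'LIMITED') AS ps
       JOIN sys.indexes AS i
         ON i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE  i.name = N'ix_TopScores_Score_DateAdded_Username';

-- Rebuild when fragmentation is high enough to matter for your workload.
ALTER INDEX ix_TopScores_Score_DateAdded_Username ON TopScores REBUILD;
```

Username is placed in the key here only because the answer lists it that way; putting it in an INCLUDE clause instead would also cover the query and is worth comparing in the actual plan.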

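-- Materialize the ranking once; the identity column serves as the position.
create table #temp (
    pos       int identity(1,1) primary key,
    Score     int,
    DateAdded datetime2,
    Username  char(255)
)

insert into #temp (Score, DateAdded, Username)
select Score, DateAdded, Username
  from TopScores
 order by Score desc, DateAdded desc, Username asc

-- Top 5, plus the user's row and its immediate neighbours.
-- The top 5 goes in a derived table because ORDER BY is not allowed
-- directly before UNION.
select top5.*
  from (select top 5 * from #temp order by pos) as top5
union
select three.*
  from #temp as me
  join #temp as three
    on me.Username = @user
   and abs(three.pos - me.pos) <= 1
 order by pos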

So what if there is a table scan on #temp for Username?
One scan does not take as long as creating an index.
That index would be severely fragmented anyway.

Licensed under: CC-BY-SA with attribution
