How does column order of a clustered Index affect performance

https://stackoverflow.com/questions/13615816

03-12-2021

Question

I have a simple mapping table with two foreign key columns (CategoryId int, ProductId int). The Primary Key is applied to both columns.

While each Product can have more than one category, it is uncommon to ever have more than 2. Categories, on the other hand, commonly have 10k+ products.

How does the order of the columns in the Primary Key affect performance?

The most common use of the table is to retrieve products based on a category:

SELECT ProductId FROM [table] WHERE CategoryId = @catid

I understand that if this were a Non-Clustered Index, I would want CategoryId first to get best performance from the above query. Does the same hold true with Clustered Indexes?

Solution

Yes, the same is true for the clustered index. The clustering key determines the order in which the rows are stored, so with CategoryId as the leading column all rows for one category sit next to each other. SQL Server can then seek to the first matching row and read the rest of that category with fast sequential I/O instead of random access.

In this case, you could define a clustered index on (CategoryId, ProductId) and a non-clustered index on (ProductId, CategoryId) if you also need to get the categories for a single product. Note that both indexes have the same key columns. If the mapping table has only two columns, the pages of both indexes will hold exactly the same data, just ordered differently. The non-clustered index should perform very well here, because it already contains every column the query needs, so SQL Server will not need to do a bookmark lookup to get other data in the row.
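
As a rough sketch of that layout in T-SQL: the table name dbo.ProductCategory, the constraint and index names, and the @prodid variable are made up for illustration (the question only calls the table [table]), and the foreign key constraints are left out because the referenced tables are not shown.

```sql
-- Hypothetical names throughout; adjust to the real schema.
CREATE TABLE dbo.ProductCategory
(
    CategoryId int NOT NULL,
    ProductId  int NOT NULL,
    -- Clustered primary key led by CategoryId:
    -- rows belonging to one category are stored together.
    CONSTRAINT PK_ProductCategory
        PRIMARY KEY CLUSTERED (CategoryId, ProductId)
);

-- Same two columns in the opposite order, for the
-- "categories of a single product" lookup.
CREATE NONCLUSTERED INDEX IX_ProductCategory_ProductId_CategoryId
    ON dbo.ProductCategory (ProductId, CategoryId);

DECLARE @catid int = 1, @prodid int = 1;

-- Can be answered by a seek on the clustered index (leading column CategoryId).
SELECT ProductId FROM dbo.ProductCategory WHERE CategoryId = @catid;

-- Can be answered by a seek on the non-clustered index (leading column ProductId).
SELECT CategoryId FROM dbo.ProductCategory WHERE ProductId = @prodid;
```

Comparing the actual execution plans of the two SELECT statements is a reasonable way to confirm that each one gets an index seek rather than a scan on your own data.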

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow