Why is query using Clustered Index when it shouldn't?

https://dba.stackexchange.com/questions/6801

16-10-2019
Question

Let us presume I have a table named Category in a SQL Server 2005 database. Category has category_id (bigint, identity) as its primary key and name (nvarchar(50)). There is obviously a clustered index on category_id, and I have also added a non-clustered index on name. If I run the query

SELECT * 
FROM Category 
WHERE [name] like '%zz%'

and look at the Execution Plan via Management Studio 2005, it says it is using a Clustered Index scan. The clustered index is on category_id, not on name.

Why is it saying that?

Solution

The leading wildcard in "like '%zz%'" largely negates the possible benefit of the index on name, since every entry has to be examined to determine whether it matches. And because the query does "SELECT *", each matched record would also need a lookup into the clustered index to get the rest of the column data. The optimizer can't reliably estimate how many matches there will be, so it chooses to scan the clustered index instead.

Try changing it to "like 'zz%'" and see what you get.
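
To compare the two plans side by side, something like the following could be run in Management Studio. This is only a sketch: the constraint name PK_Category, the index name IX_Category_name, the dbo schema and the NOT NULL on name are assumptions, since the question does not give them. SET SHOWPLAN_TEXT returns the estimated plan as text without executing the query, and GO is the batch separator used by Management Studio and sqlcmd.

```sql
-- Hypothetical setup matching the table described in the question.
-- PK_Category, IX_Category_name and NOT NULL on [name] are assumed.
CREATE TABLE dbo.Category (
    category_id bigint IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_Category PRIMARY KEY CLUSTERED,
    [name]      nvarchar(50) NOT NULL
);
CREATE NONCLUSTERED INDEX IX_Category_name ON dbo.Category ([name]);
GO

-- Return estimated plans as text instead of executing the queries.
-- This SET statement has to be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO

-- Leading wildcard: a seek on [name] is not possible,
-- so some kind of scan is expected.
SELECT * FROM dbo.Category WHERE [name] LIKE N'%zz%';

-- Fixed prefix: the optimizer can seek on IX_Category_name
-- if it judges that cheaper than a scan.
SELECT * FROM dbo.Category WHERE [name] LIKE N'zz%';
GO

SET SHOWPLAN_TEXT OFF;
GO
```

The plan you actually get depends on row counts and statistics, so on an empty or very small table both queries may still show a scan.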

OTHER TIPS

Because in your example [category_id] is the PK (identity), and a PK gets a clustered index by default (in MS-SQL a table can have only one clustered index).

Also, even though [name] has a non-clustered index, the leading wildcard rules out a seek on it, so SQL ends up scanning via the PK (and thus is looking at every row at 100% cost).

BTW: Here is an intriguing article on indexes and LIKE: http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

EDIT - Added a snippet of the link above that sums the article up, as links can become obsolete:

Author: Number2 (John Nelson)

...the rules for index usage with LIKE are loosely like this:

1) If your filter criteria uses equals = and the field is indexed, then most likely it will use an INDEX/CLUSTERED INDEX SEEK

2) If your filter criteria uses LIKE, with no wildcards (like if you had a parameter in a web report that COULD have a % but you instead use the full string), it is about as likely as #1 to use the index. The increased cost is almost nothing.

3) If your filter criteria uses LIKE, but with a wildcard at the beginning (as in Name0 LIKE '%UTER') it's much less likely to use the index, but it still may at least perform an INDEX SCAN on a full or partial range of the index.

4) HOWEVER, if your filter criteria uses LIKE, but starts with a STRING FIRST and has wildcards somewhere AFTER that (as in Name0 LIKE 'COMP%ER'), then SQL may just use an INDEX SEEK to quickly find rows that have the same first starting characters, and then look through those rows for an exact match.

NB: (Also keep in mind, the SQL engine still might not use an index the way you're expecting, depending on what else is going on in your query and what tables you're joining to. The SQL engine reserves the right to rewrite your query a little to get the data in a way that it thinks is most efficient and that may include an INDEX SCAN instead of an INDEX SEEK)
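
As a rough illustration of the four rules above, here is one query per rule against the Category table from the question. It is a sketch, not a guarantee of what the optimizer will do: it assumes the table and its non-clustered index on [name] already exist, and the search strings are made up. Only category_id and [name] are selected so that the index on [name] can answer each query without a lookup.

```sql
-- Assumes dbo.Category with a non-clustered index on [name] already exists.
-- The literal values are hypothetical examples.
SET SHOWPLAN_TEXT ON;
GO

-- Rule 1: equality on an indexed column, most likely an index seek.
SELECT category_id, [name] FROM dbo.Category WHERE [name] = N'Computers';

-- Rule 2: LIKE without wildcards, about as likely to seek as rule 1.
SELECT category_id, [name] FROM dbo.Category WHERE [name] LIKE N'Computers';

-- Rule 3: leading wildcard, a seek is unlikely; an index scan is still possible.
SELECT category_id, [name] FROM dbo.Category WHERE [name] LIKE N'%uters';

-- Rule 4: fixed prefix with a wildcard later, a seek on the prefix 'Comp'
-- may be used, with the rest of the pattern checked against those rows.
SELECT category_id, [name] FROM dbo.Category WHERE [name] LIKE N'Comp%ers';
GO

SET SHOWPLAN_TEXT OFF;
GO
```

As the NB says, the operator that actually shows up can differ depending on table size, statistics and whatever else is in the query.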

Licensed under: CC-BY-SA with attribution