I am wondering whether I should add a non-clustered index to a column with non-unique values in a SQL Server 2008 R2 table. A simplified example:

 SELECT Id, FirstName, LastName, City
 FROM Customers
 WHERE City = 'MyCity'

My understanding is that the primary key [Id] should be the clustered index.

Can a non-clustered index be added to the non-unique column [City]? Is this going to improve performance, or should I not bother at all?

Thanks.

I was thinking of creating a clustered index as:

 CREATE UNIQUE CLUSTERED INDEX IDX_Customers_City 
  ON Customers (City, Id);

or a non-clustered one, assuming there is already a clustered index on that table:

  CREATE NONCLUSTERED INDEX IX_Customers_City 
  ON Customers (City, Id);

In reality I am dealing with a table of millions of records. The SELECT statement returns 0.1% to 5% of the records.


Solution

Generally yes - you would usually make the clustered index on the primary key. The exception to this is when you never make lookups based on the primary key, in which case putting the clustered index on another column might be more pertinent.

You should generally add non-clustered indexes to columns that are used as foreign keys, provided there is a reasonable amount of diversity in that column, which I'll explain with an example.

The same applies to columns used in WHERE clauses, ORDER BY, etc.

Example

CREATE TABLE Gender (
 GenderId INT NOT NULL PRIMARY KEY CLUSTERED,
 Value NVARCHAR(50) NOT NULL)

INSERT Gender(GenderId, Value) VALUES (1, 'Male'), (2, 'Female')

CREATE TABLE Person (
  PersonId INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
  Name NVARCHAR(50) NOT NULL,
  GenderId INT NOT NULL FOREIGN KEY REFERENCES Gender(GenderId)
)

-- ORDER is a reserved word in T-SQL, so the table name has to be bracketed
CREATE TABLE [Order] (
  OrderId INT NOT NULL IDENTITY(1,1) PRIMARY KEY CLUSTERED,
  OrderDate DATETIME NOT NULL DEFAULT GETDATE(),
  OrderTotal DECIMAL(14,2) NOT NULL,
  OrderedByPersonId INT NOT NULL FOREIGN KEY REFERENCES Person(PersonId)
)

In this simple set of tables it would be a good idea to put an index on the OrderedByPersonId column of the Order table, as you are very likely to want to retrieve all the orders for a given person, and the column is likely to have a high amount of diversity. By a high amount of diversity (or selectivity) I mean that if you have, say, 1000 customers, each customer is likely to have only 1 or 2 orders, so looking up all the rows in the Order table with a given OrderedByPersonId returns only a very small proportion of the total records in that table.
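As a minimal sketch of the index described above (the index name IX_Order_OrderedByPersonId is only an illustrative choice, not something from the original tables):

```sql
-- Non-clustered index on the foreign key column of the Order table.
-- The index name is hypothetical; the table name is bracketed because ORDER is a reserved word.
CREATE NONCLUSTERED INDEX IX_Order_OrderedByPersonId
    ON [Order] (OrderedByPersonId);

-- The kind of lookup this index is meant to support:
-- all orders placed by one person (the value 42 is just a placeholder).
SELECT OrderId, OrderDate, OrderTotal
FROM [Order]
WHERE OrderedByPersonId = 42;
```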

By contrast, there's not much point in putting an index on the GenderId column in the Person table, as it will have very low diversity. The query optimiser would most likely not use such an index, and INSERT/UPDATE statements would be marginally slower because of the extra need to maintain the index.

Now to go back to your example - the answer would have to be "it depends". If you have hundreds of cities in your database then yes, it might be a good idea to index that column. If, however, you only have 3 or 4 cities, then no - don't bother. As a guideline I might say that if the selectivity of the column is 0.9 or higher (i.e. a WHERE clause selecting a single value in the column would result in only 10% or less of the rows being returned) an index might help, but this is by no means a hard and fast figure!
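One way to get a feel for that selectivity is to count rows per city. The queries below are a rough sketch against the Customers table from the question; the column aliases are made up for illustration, and the 10% figure above remains only a guideline.

```sql
-- Overall picture: how many rows and how many distinct cities there are.
SELECT
    COUNT(*)             AS TotalRows,
    COUNT(DISTINCT City) AS DistinctCities,
    -- Average fraction of the table matched by one city value
    -- (NULLIF avoids a divide-by-zero on an empty table).
    1.0 / NULLIF(COUNT(DISTINCT City), 0) AS AvgFractionPerCity
FROM Customers;

-- Per-city picture: the most common cities and their share of the table.
SELECT TOP (10)
    City,
    COUNT(*) AS RowsForCity,
    100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS PercentOfTable
FROM Customers
GROUP BY City
ORDER BY RowsForCity DESC;
```

If even the most common cities account for only a few percent of the table each, the column is probably selective enough for an index to be worth testing.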

Even if the column is very selective/diverse you might not bother indexing it if queries are only made very infrequently on it.

One of the easiest things to do, though, is to try your queries with the execution plan displayed in SQL Server Management Studio. It will suggest indexes for you if the query optimiser thinks that they'll make a positive impact.
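Alongside the execution plan, a before/after comparison of logical reads can show whether an index actually helps. This is only a sketch under assumptions: the index name IX_Customers_City_Covering is hypothetical, and the INCLUDE columns are chosen so that the index covers the query from the question (Id is assumed to be the clustered key, so it is carried in the non-clustered index automatically).

```sql
-- Report logical reads per statement in the Messages tab.
SET STATISTICS IO ON;

-- Baseline: run the query before adding the index and note the logical reads.
SELECT Id, FirstName, LastName, City
FROM Customers
WHERE City = 'MyCity';

-- Hypothetical covering index: City is the key, the other selected columns
-- are included so the query does not need lookups into the clustered index.
CREATE NONCLUSTERED INDEX IX_Customers_City_Covering
    ON Customers (City)
    INCLUDE (FirstName, LastName);

-- Run the same query again and compare the logical reads with the baseline.
SELECT Id, FirstName, LastName, City
FROM Customers
WHERE City = 'MyCity';

SET STATISTICS IO OFF;
```

Including extra columns makes the index larger and more expensive to maintain on INSERT/UPDATE, so it is worth checking that the read savings justify it.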

Hope that helps!

Other tips

If you run the query frequently, or if you regularly sort by city in online applications, it makes sense to add an index, especially if your table is dense or has a large row size. Too many indexes slow down your inserts and updates. You will only be able to evaluate the actual benefit once the table holds a significant amount of data.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow