Question

For argument's sake, let's say it's for SQL Server 2005/2008. I understand that when you place indexes on a table to tune SELECT statements, these indexes need maintaining during INSERT / UPDATE / DELETE actions.

My main question is this:

When will SQL Server maintain a table's indexes?

I have many subsequent questions:

I naively assume that it will do so after a command has executed. Say you are inserting 20 rows, it will maintain the index after 20 rows have been inserted and committed.

  • What happens when a script features multiple distinct statements against the same table?

  • Does the server have the intelligence to maintain the indexes after all the statements have executed, or does it do so per statement?

I've seen situations where indexes are dropped before, and recreated after, large or numerous INSERT / UPDATE actions.

  • This presumably incurs rebuilding the entire table's indexes even if you only change a handful of rows?

  • Would there be a performance benefit in attempting to collate INSERT and UPDATE actions into a larger batch, say by collecting rows to insert in a temporary table, as opposed to doing many smaller inserts? (See the sketch after this list.)

  • How would collating the rows as above stack up against dropping and recreating an index, or simply taking the maintenance hit?
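To illustrate what I mean by collating rows, this is roughly the pattern I have in mind (the table and column names here are just placeholders):

-- Collect the rows to be written into a temp table first...
CREATE TABLE #Staging (CustomerID INT NOT NULL, Amount MONEY NOT NULL);

INSERT INTO #Staging (CustomerID, Amount) VALUES (1, 10.00);
INSERT INTO #Staging (CustomerID, Amount) VALUES (2, 20.00);
-- ...and so on for the rest of the batch...

-- ...then write them to the indexed target table in a single set-based statement,
-- rather than issuing many separate single-row INSERTs against it.
INSERT INTO dbo.Orders (CustomerID, Amount)
SELECT CustomerID, Amount
FROM #Staging;

DROP TABLE #Staging;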

Sorry for the proliferation of questions - it's something I've always known to be mindful of, but when trying to tune a script to get a balance, I find I don't actually know when index maintenance occurs.

Edit: I understand that performance questions largely depend on the amount of data in the insert/update and the number of indexes. Again for argument's sake, assume two situations:

  • An index-heavy table tuned for SELECTs.
  • An index-light table (primary key only).

Both situations would have a large insert/update batch, say, 10k+ rows.

Edit 2: I'm aware that I can profile a given script against a data set. However, profiling doesn't tell me why one approach is faster than another. I'm more interested in the theory behind indexes and where the performance issues stem from, not a definitive "this is faster than that" answer.

Thanks.


Solution

When your statement (not even your transaction) completes, all your indexes are up to date. When you commit, all the changes become permanent and all locks are released. Doing otherwise would not be "intelligence"; it would violate integrity and possibly cause errors.

Edit: by "integrity" I mean this: once committed, the data should be immediately available to anyone. If the indexes are not up-to-date at that moment, someone may get incorrect results.

As you increase the batch size, performance initially improves, then it starts to degrade. You need to run your own benchmarks to find your optimal batch size. Similarly, you need to benchmark to determine whether it is faster to drop and recreate the indexes or not.
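As a minimal sketch of such a benchmark (the names dbo.Target and #Batch are placeholders, not from the question), you can time each approach with GETDATE():

-- Rough timing harness: many single-row inserts vs. one set-based insert.
-- dbo.Target (with its indexes) and #Batch are assumed to exist already;
-- run each approach against a fresh copy of the table to keep the comparison fair.
SET NOCOUNT ON;
DECLARE @start DATETIME, @i INT;

-- Approach 1: 10,000 single-row INSERTs; the indexes are maintained once per
-- statement, i.e. 10,000 times.
SET @start = GETDATE();
SET @i = 0;
WHILE @i < 10000
BEGIN
    INSERT INTO dbo.Target(n) VALUES (@i);
    SET @i = @i + 1;
END;
SELECT DATEDIFF(ms, @start, GETDATE()) AS RowByRowMs;

-- Approach 2: one set-based INSERT from a staging table; the indexes are
-- maintained once for the whole statement.
SET @start = GETDATE();
INSERT INTO dbo.Target(n)
SELECT n FROM #Batch;
SELECT DATEDIFF(ms, @start, GETDATE()) AS SingleStatementMs;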

Edit: if you insert/update/delete batches of rows in one statement, your indexes are modified once per statement. The following script demonstrates that:

CREATE TABLE dbo.Num(n INT NOT NULL PRIMARY KEY);
GO
INSERT INTO dbo.Num(n)
SELECT 0
UNION ALL
SELECT 1;
GO
-- In a single statement the swap works: 0 becomes 1 and 1 becomes 0,
-- because the primary key is enforced once for the whole statement.
UPDATE dbo.Num SET n = 1-n;
GO
-- Doing it row by row would fail no matter which order you choose:
-- the first UPDATE produces a value the other row already holds,
-- violating the primary key at that statement's boundary.
UPDATE dbo.Num SET n = 1-n WHERE n=0;
UPDATE dbo.Num SET n = 1-n WHERE n=1;
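And if you want to test the drop/recreate route mentioned earlier, the rough pattern on 2005/2008 (the index and table names below are placeholders) is to disable a nonclustered index, load, then rebuild it:

-- IX_Orders_Customer and dbo.Orders are placeholder names for this sketch.
-- Disable the nonclustered index so the bulk load pays no maintenance on it...
ALTER INDEX IX_Orders_Customer ON dbo.Orders DISABLE;

-- ...run the large insert/update...
INSERT INTO dbo.Orders (CustomerID, Amount)
SELECT CustomerID, Amount FROM #Staging;

-- ...then rebuild the index once at the end. The rebuild has to read the whole
-- table, which is why this only pays off for sufficiently large batches.
ALTER INDEX IX_Orders_Customer ON dbo.Orders REBUILD;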