Question

If I add or drop columns from a SQL Server table, I presume I get page splits or gaps, since the size of the rows has changed.

When I use RedGate SQL Compare to create conversion scripts, its strategy is to create a temporary table, copy all the data into that table, drop the old table, and then rename the temporary table.

I assume this cleans up the pages as all the rows were "perfectly" sequentially inserted.

I recently had a DBA tell me that this "copy and rename" approach is inefficient, expensive and unnecessary.

What are the merits of these two approaches?


Solution

I recommend you read SQL Server table columns under the hood. You'll see that many column DDL operations leave 'ghost' columns in the table: physical columns that still exist in the table but are not visible to the user. A rebuild removes all these ghost columns.
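As a minimal sketch of what that looks like in practice (the table and column names here are made up for illustration):

```sql
-- Illustrative names only. Dropping a column is a metadata-only
-- operation: the dropped column's data stays on the pages as a
-- hidden 'ghost' column until the table is rebuilt.
ALTER TABLE dbo.MyTable DROP COLUMN ObsoleteColumn;

-- Rewrites every row and physically removes the ghost columns
-- (ALTER TABLE ... REBUILD is available in SQL Server 2008 and later).
ALTER TABLE dbo.MyTable REBUILD;
```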

Most times this is benign, but there are some dark areas that can lead to the issues described in KB2504090:

This issue occurs because the accessor that SQL Server uses to insert data into different partitions recognizes the metadata changes incorrectly. When data is inserted into the new partition that is created after a column is dropped, the number of the maximum nullable columns in the new partition may be one fewer than the number of the maximum nullable columns in the old partition.

I for one shun all schema compare tools and diff-based deployment/upgrades. There simply is no one-size-fits-all correct approach to changing schema. Just as Henrik mentions, I've been burned by the 500GB table 'copy'; tyvm, no more diffs for me. Instead I recommend migrations, coded as SQL scripts and tested on relevant data sizes before being deployed. See Version Control and your Database. Rails ActiveRecord Migrations really grok this.
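A sketch of the migrations approach, assuming a hand-rolled version table (dbo.SchemaVersion, the version number, and the table names are all illustrative, not any particular framework's convention): each change is a small script that checks whether it has already run, applies one change, and records itself.

```sql
-- Illustrative migration script; all object names are assumptions.
IF NOT EXISTS (SELECT 1 FROM dbo.SchemaVersion WHERE Version = 42)
BEGIN
    -- The actual schema change for this migration.
    ALTER TABLE dbo.Orders ADD ShippedDate datetime NULL;

    -- Record that the migration ran, so re-running the deploy is a no-op.
    INSERT INTO dbo.SchemaVersion (Version, AppliedAt)
    VALUES (42, SYSDATETIME());
END
```

Because each script is idempotent and ordered, the same set of migrations can be tested against a production-sized copy before the real deployment.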

OTHER TIPS

If your change is a meta-data change only, this does not touch all rows in the table, and no page splits occur. As in

ALTER TABLE MyTable Add MyNewColumn varchar(50)

If it is not a meta-data change only, (as in

ALTER TABLE MyTable Add MyNewIntColumn int NOT NULL DEFAULT (0)

), then a rebuild of your clustered index will do a "copy and rename" under the hood.

Some of our tables have 15 billion rows, and it takes 15-20 hours to rebuild the clustered index. I use the "copy and rename" approach when I cannot accept the 20 hours of downtime waiting for the rebuild. First I copy 14.99 billion rows, then I disable the job that inserts rows, copy the remaining 10 million rows across to the new structure, and finally do the rename.

Yes, it is a bit expensive (using my time), but the system remains online for as long as possible.
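The staged approach above can be sketched like this (all object names and the cut-off value are hypothetical, and a real script would also recreate indexes, constraints, triggers, and permissions on the new table):

```sql
-- Hypothetical names throughout; not a drop-in script.
DECLARE @HighWaterMark bigint = 14990000000;  -- last row to copy online

-- Phase 1: bulk-copy the vast majority of rows while the system
-- stays online and the insert job keeps running.
SELECT *
INTO   dbo.MyTable_New
FROM   dbo.MyTable
WHERE  Id <= @HighWaterMark;

-- Phase 2: with the insert job disabled, copy the remaining tail.
INSERT INTO dbo.MyTable_New
SELECT *
FROM   dbo.MyTable
WHERE  Id > @HighWaterMark;

-- Phase 3: swap the tables.
EXEC sp_rename 'dbo.MyTable', 'MyTable_Old';
EXEC sp_rename 'dbo.MyTable_New', 'MyTable';
```

The downtime window covers only phases 2 and 3, which is why the split point is chosen to leave as small a tail as possible.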

You rebuild an index like this:

USE AdventureWorks2012;
GO
ALTER INDEX ALL ON HumanResources.Employee REBUILD;
GO

This can be scripted, so it is a lot easier than "copy and rename".

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange