Question

We've got a project site where we have to replicate a legacy database system into SQL Server 2008 on a nightly basis.

We are using the SQL DataWizard tool from Maestro to do the job, and because we cannot get an accurate delta every night, it was decided that we would drop the previous SQL Server copy and take a fresh snapshot every night: several million rows across about 10 tables. The snapshot takes about 2 hours to run.

Now, we also need to create some custom indexes on the snapshot copy of the data, so that certain BI tools can query the data quickly.

My question is: is it more efficient to create the tables AND the indexes before the snapshot copy runs, or to create the table structures only, run the snapshot copy, and then create the indexes after the tables are populated?

Is there a performance difference between SQL Server maintaining the indexes WHILE rows are being added, versus adding all the rows first and then creating the indexes on the final data set?

Just trying to work out which way will result in less database server CPU overhead.


Solution

When you perform snapshot replication, the first task is to bulk copy the data. Only after the data has been copied are the primary and secondary indexes added; the indexes don't exist until that second step completes. So no, there is no extra gain to be had by applying the indexes yourself after the snapshot — they are already being built after the data is loaded.
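For indexes the replication process does not manage — such as the custom BI indexes mentioned in the question — the same load-then-index ordering can be scripted by hand. Below is a minimal T-SQL sketch of that pattern; the table, index names, and file path are illustrative assumptions, not taken from the original post:

```sql
-- Hypothetical nightly rebuild: bare table, bulk load, then indexes.
-- All object names and the source path are placeholders.

-- 1. Recreate the bare table structure (no indexes yet).
IF OBJECT_ID('dbo.SalesSnapshot', 'U') IS NOT NULL
    DROP TABLE dbo.SalesSnapshot;

CREATE TABLE dbo.SalesSnapshot (
    SaleID   INT           NOT NULL,
    Region   VARCHAR(50)   NOT NULL,
    SaleDate DATE          NOT NULL,
    Amount   DECIMAL(18,2) NOT NULL
);

-- 2. Bulk load the nightly extract. With TABLOCK and no indexes on the
--    target, the load can be minimally logged (in SIMPLE or BULK_LOGGED
--    recovery), so SQL Server is not maintaining index trees row by row.
BULK INSERT dbo.SalesSnapshot
FROM '\\legacyserver\export\sales.dat'  -- placeholder path
WITH (TABLOCK, FIELDTERMINATOR = '|', ROWTERMINATOR = '\n');

-- 3. Build the custom BI indexes once, over the final data set.
CREATE CLUSTERED INDEX IX_SalesSnapshot_SaleID
    ON dbo.SalesSnapshot (SaleID);

CREATE NONCLUSTERED INDEX IX_SalesSnapshot_Region_Date
    ON dbo.SalesSnapshot (Region, SaleDate) INCLUDE (Amount);
```

Building each index in one pass over the loaded data is a single sort per index, which is generally cheaper than updating every index for each of the several million inserted rows.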

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow