Question

Backing up a large table with SELECT .. INTO .. took almost 4 hours on a machine with 4 CPUs and 16 GB of RAM. No external application or process was accessing the table during the operation.

The table size was 220 GB and the SELECT .. INTO was a simple one (i.e. SELECT * INTO BACKUP_TABLE FROM ORIGINAL_TABLE).

This was a test environment, and based on that result I need to estimate the execution time for the same operation in the production environment. The production environment has 40 CPUs and 64 GB of RAM.

The CPUs are identical and the I/O subsystems of both environments are the same (i.e. the disk type and disk layout are the same).

Is it realistic to estimate that the production SELECT .. INTO .. will complete in less than an hour, given that the production server has 10 times more processing power?

If this cannot be answered from the information above, should I re-run the test and collect some metrics? If so, which metrics should they be?

Thanks in advance, for providing your thoughts on this!


Solution

On SQL Server 2008 R2, SELECT...INTO queries are not eligible for parallelism (support for that was introduced in SQL Server 2014). Unfortunately, this means the higher CPU count in production will not help your overall runtime.

You could re-run your test with a pre-created BACKUP_TABLE, and then use INSERT INTO...SELECT to see how parallelism affects the test.
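As a rough sketch of that re-run, the two steps could look like the following. The table names are the ones from the question; the `WHERE 1 = 0` trick for creating an empty copy and the `TABLOCK` hint are assumptions added here, not something the original test used.

```sql
-- Step 1: create an empty copy of the table structure (no rows are copied).
SELECT *
INTO   BACKUP_TABLE
FROM   ORIGINAL_TABLE
WHERE  1 = 0;

-- Step 2: load the rows in a separate statement.
-- TABLOCK takes a table-level lock on the target; it is an optional hint
-- added here as an assumption, remove it if the lock is not acceptable.
INSERT INTO BACKUP_TABLE WITH (TABLOCK)
SELECT *
FROM   ORIGINAL_TABLE;
```

If ORIGINAL_TABLE has an identity column, the `INSERT ... SELECT *` form will be rejected; in that case an explicit column list together with `SET IDENTITY_INSERT BACKUP_TABLE ON` would be needed. Comparing the actual execution plans of both variants shows whether any part of the statement ran in parallel.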


Regardless of parallelism, since you're copying a 220 GB table, your main bottleneck is likely disk speed, specifically writing all of this data to the transaction log. I'd make sure you won't hit file growth events during the run: pre-grow the production log file to whatever size the log file on the test instance ended up at.
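One way to check the log size on the test instance and pre-grow the log in production is sketched below. The database name `MyDatabase`, the logical file name `MyDatabase_log`, and the 50 GB target are placeholders, not values from the question; substitute whatever the test instance actually shows.

```sql
-- Log size and percentage in use for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- Current size and growth setting of the log file(s) of the current database.
-- size is reported in 8 KB pages, so size / 128 gives MB.
SELECT name,
       size / 128 AS size_mb,
       CASE WHEN is_percent_growth = 1
            THEN CAST(growth AS varchar(10)) + ' %'
            ELSE CAST(growth / 128 AS varchar(10)) + ' MB'
       END AS growth_setting
FROM   sys.database_files
WHERE  type_desc = 'LOG';

-- Pre-grow the log file (hypothetical names and size, see above).
-- The new size must be larger than the current size.
ALTER DATABASE MyDatabase
MODIFY FILE (NAME = MyDatabase_log, SIZE = 50GB);
```

Running the first two statements on the test instance right after the copy finishes gives the size to aim for in production.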

I'd run your test again and measure wait stats to see how much you're hitting WRITELOG waits. You can use Paul Randal's scripts for that:

Capturing wait statistics for a period of time

Run this several times while the insert is running, for example over 30-, 60-, or 90-second intervals, to get a feel for where the bottlenecks are.
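If the linked script is not at hand, the idea can be approximated with a simplified sketch like the one below. This is not Paul Randal's script: it only takes two snapshots of `sys.dm_os_wait_stats` and reports the difference, and it does not filter out benign background waits. The temp table name and the 60-second interval are arbitrary choices.

```sql
-- Snapshot the cumulative, instance-wide wait statistics.
IF OBJECT_ID('tempdb..#waits_before') IS NOT NULL
    DROP TABLE #waits_before;

SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
INTO   #waits_before
FROM   sys.dm_os_wait_stats;

-- Sampling interval: 60 seconds (adjust as needed).
WAITFOR DELAY '00:01:00';

-- Waits accumulated during the interval, largest first.
SELECT TOP (20)
       a.wait_type,
       a.waiting_tasks_count - b.waiting_tasks_count AS waits,
       a.wait_time_ms        - b.wait_time_ms        AS wait_ms,
       a.signal_wait_time_ms - b.signal_wait_time_ms AS signal_wait_ms
FROM   sys.dm_os_wait_stats AS a
JOIN   #waits_before        AS b ON b.wait_type = a.wait_type
WHERE  a.wait_time_ms - b.wait_time_ms > 0
ORDER BY wait_ms DESC;
```

Because the counters cover the whole instance, the numbers are easiest to interpret when the copy is the only significant workload, as it was in the test described in the question.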

You mentioned that your test was run on a system with no other concurrent activity against this table. The other thing that might make production behave differently from your test is the SELECT INTO query getting blocked by other processes.
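A quick way to see whether anything is being blocked while the copy runs is to look at the currently executing requests. This is a minimal sketch using the standard `sys.dm_exec_requests` view; it lists every blocked request on the instance, not only the SELECT INTO session.

```sql
-- Requests that are currently waiting on another session.
SELECT r.session_id,
       r.blocking_session_id,   -- session holding the resource
       r.wait_type,
       r.wait_time AS wait_time_ms,
       r.command,
       r.status
FROM   sys.dm_exec_requests AS r
WHERE  r.blocking_session_id <> 0;
```

An empty result while the copy is running suggests blocking is not a factor at that moment.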

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange