Question

This is a conceptual question. I am just moving from MS Access to SQL Server for stability and scalability.

I need to maintain a database that pulls from another server daily. Due to the possibility (and probability) of record changes on the other server, I have to pull using a 10 day rolling window with the expectation that anything older than 10 days will not change by policy.

The pull is done in stages: the initial pass gets just the records within a date range, then the script moves to other tables one at a time to pull the relevant, related data.

I have written a script that works with date range variables. If I set the range to 10 days, it pulls everything in about 8 hours. To test whether looping might be better, I had the script loop one day at a time, starting at today - 10 and continuing while the date is < today. Run that way, it takes 16 hours to do 3 days.

Being new, I am curious about the logical reason why the looping-by-date approach is so much slower. My thought was to try to mitigate the impact on the other server, but maybe that isn't what happens conceptually.

The code works perfectly in both cases with the only difference being looped or all at once.

Thanks for any insight on this!


Solution

Since you are pulling from another server, you are probably using a linked server. Also, since it's the most intuitive way to do it, you are probably doing something like this:

select somefields
from ServerName.DatabaseName.Owner.TableName
where whatever

With this four-part-name syntax, SQL Server does not always send the where clause to the remote server. When it can't, it brings the entire contents of the remote table to the local server first, and then applies the where clause locally. If the remote table has a lot of data, it takes a long time to transfer it over.

When you ran your original query, the data was transferred once. When you set up your loop, it was transferred again on every pass through the loop. That's why it took more time.

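To speed up queries against linked servers, use OPENQUERY. It sends the query text to the remote server as-is, so the where clause runs there and only the matching rows come back over the network.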
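As a rough sketch of what that could look like for a date-range pull: the names below (ServerName, DatabaseName.Owner.TableName, somefields, ModifiedDate) are placeholders, not your real objects, and I am assuming the remote server is also SQL Server. OPENQUERY does not accept variables for its arguments, so when the range comes from variables the statement has to be built as a string and executed dynamically.

```sql
-- Fixed 10 day window: the WHERE clause is evaluated on the remote server.
-- ModifiedDate is a hypothetical column; use whatever date column you filter on.
SELECT somefields
FROM OPENQUERY(ServerName,
    'SELECT somefields
     FROM DatabaseName.Owner.TableName
     WHERE ModifiedDate >= DATEADD(DAY, -10, CAST(GETDATE() AS date))');

-- Variable window: build the remote query text first, then wrap it in OPENQUERY.
DECLARE @StartDate date = DATEADD(DAY, -10, CAST(GETDATE() AS date));
DECLARE @EndDate   date = CAST(GETDATE() AS date);

-- Style 112 formats the dates as yyyymmdd, which is unambiguous on SQL Server.
DECLARE @RemoteQuery nvarchar(max) =
      N'SELECT somefields FROM DatabaseName.Owner.TableName'
    + N' WHERE ModifiedDate >= ''' + CONVERT(char(8), @StartDate, 112) + N''''
    + N' AND ModifiedDate < '''    + CONVERT(char(8), @EndDate, 112)   + N'''';

-- Double the embedded quotes so the remote query survives as a string literal.
DECLARE @Sql nvarchar(max) =
      N'SELECT * FROM OPENQUERY(ServerName, '''
    + REPLACE(@RemoteQuery, N'''', N'''''')
    + N''');';

EXEC sys.sp_executesql @Sql;
```

Only dates are concatenated into the string here. If you ever build the filter from free text, validate it first, because this pattern is open to SQL injection. You can confirm where the filtering happens by checking the execution plan for the text of the query that is sent to the remote server.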

Licensed under: CC-BY-SA with attribution