Question

I need to pull a large amount of data from various tables across a line that has very low bandwidth. I need to minimize the amount of data that gets sent to and fro.

On that side is a Sybase database, on this side SQL Server 2008.

What I need is to pull all the tables from the Sybase database that have to do with this office. Let's say I have the following tables as an example:

Farm 
Tree 
Branch 
etc.

(one farm has many trees, one tree has many branches etc.)

Let's say the "Farm" table has a field called "CountryID", and I only want the data where CountryID=12. The actual table structures I am looking at are very complex (and I am also not very familiar with them), so I want to try to keep the queries simple.

So I am thinking of setting up a series of views:

CREATE VIEW vw_Farm AS 
SELECT * from Farm where CountryID=12

CREATE VIEW vw_Tree AS 
SELECT * from Tree where FarmID in (SELECT FarmID FROM vw_Farm)

CREATE VIEW vw_Branch AS 
SELECT * from Branch where TreeID in (SELECT TreeID FROM vw_Tree)

etc.

To pull the actual data across I would then do:

SELECT * into localDb.Farm from vw_Farm
SELECT * into localDb.Tree from vw_Tree
SELECT * into localDb.Branch from vw_Branch

etc.

Simple enough to set up. I am wondering how this will perform though? Will it perform all the SELECT statements on the Sybase side and then just send back the result? Also, since this will be an iterative process, is it possible to index the views for subsequent calls?

Any other optimisation suggestions would also be welcome!

Thanks
Karl

EDIT: Just to clarify, the views will be set up in SQL Server. I am using a linked server to Sybase ASE to set up those views. What is worrying me in particular is whether the fact that the view is in SQL Server on this side, and not in Sybase on that side, will mean that for each iteration the data from the preceding view gets pulled across to SQL Server first, before the calculations get executed. I want Sybase to do all the calcs and just pass the results across.


Solution

It's difficult to be certain without testing, but my somewhat-relevant experience (using linked servers to platforms other than Sybase, and on SQL Server 2005) has been that using subqueries (such as your code for vw_Tree and vw_Branch) more or less guarantees that SQL Server will pull all the data for the outer table into a local temp table, then match it to the results of the inner query.

The problem is that SQL Server has no access to the linked server's table statistics, so it can make no meaningful decisions about how to optimise the query.

If you want to be sure to have the work done on the Sybase server, your best bet will be to write code (could be views or stored procedures) on the Sybase side and reference them from SQL Server.
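As a rough sketch of that approach, the views could be created on the Sybase side and then read from SQL Server with a pass-through query, so that the query text is executed by Sybase and only the result set crosses the line. The linked server name SYBASE_LINK and the target table dbo.Tree are hypothetical, and the view definitions simply reuse the example tables from the question; treat this as an illustration rather than a tested setup.

```sql
-- Run on the Sybase ASE side (each CREATE VIEW in its own batch).
-- Table and column names are the example ones from the question.
CREATE VIEW vw_Farm AS
SELECT * FROM Farm WHERE CountryID = 12
go

CREATE VIEW vw_Tree AS
SELECT * FROM Tree WHERE FarmID IN (SELECT FarmID FROM vw_Farm)
go

-- Run on the SQL Server side.
-- SYBASE_LINK is an assumed linked server name; dbo.Tree is an assumed
-- local target table that SELECT ... INTO will create.
-- OPENQUERY sends the quoted text to the linked server as-is, so the
-- filtering happens there and only matching rows are returned.
-- The view name may need qualifying with the Sybase database and owner,
-- depending on the login's default database.
SELECT *
INTO   dbo.Tree
FROM   OPENQUERY(SYBASE_LINK, 'SELECT * FROM vw_Tree');
```

The same pattern would repeat for each table in the hierarchy. Whether this is faster in practice still depends on the provider and the network, so it is worth timing on a small table first.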

Linked server connections are, in my experience, not particularly resilient over flaky networks. If it's available, you could consider using Integration Services rather than linked-server queries - but even that may not be much better. You may need to consider falling back on moving text files with robocopy and bcp.

Licensed under: CC-BY-SA with attribution