Query tuning - SQL Server

https://dba.stackexchange.com/questions/13647

16-10-2019

Question

One of our developers is trying to run the query below on a development server. It pulls data from a linked server (production). The query ran for more than 14 hours before it was stopped.

I looked at the execution plan in SQL Sentry Plan Explorer - please find the execution plan below.

How can this query be tuned for better performance? Are there any glaring errors in the query? Are there any pointers or blog posts that will help me improve this query?

Both of the servers involved are SQL Server 2005.

SELECT A.SETID
,A.CUST_ID
,A.CNTCT_SEQ_NUM
,A.NAME1
,A.TITLE
,C.DESCR
FROM PS_CUST_CONTACT A
,[linksrv].[prodDB].dbo.PS_BO_ROLE Z
,[linksrv].[prodDB].dbo.PS_RD_PERSON B
,[linksrv].[prodDB].dbo.PS_BO_ROLE_TYPE C
WHERE Z.BO_ID = B.BO_ID
AND Z.ROLE_TYPE_ID = C.ROLE_TYPE_ID
AND Z.ROLE_END_DT >= GETDATE()
AND A.EFFDT = (
    SELECT MAX(EFFDT)
    FROM PS_CUST_CONTACT CUST_CONTACT
    WHERE CUST_CONTACT.SETID = A.SETID
        AND CUST_CONTACT.CUST_ID = A.CUST_ID
        AND CUST_CONTACT.CNTCT_SEQ_NUM = A.CNTCT_SEQ_NUM
        AND CUST_CONTACT.EFFDT <= { FN CURDATE() }
    )
AND A.EFF_STATUS = 'A'
AND B.PERSON_ID IN (
    SELECT A1.PERSON_ID
    FROM PS_CONTACT A1
        ,PS_CONTACT_CUST B1
    WHERE A1.EFFDT = (
            SELECT MAX(A_ED.EFFDT)
            FROM PS_CONTACT A_ED
            WHERE A1.SETID = A_ED.SETID
                AND A1.CONTACT_ID = A_ED.CONTACT_ID
                AND A_ED.EFFDT <= SUBSTRING(CONVERT(CHAR, GETDATE(), 121), 1, 10)
            )
        AND A1.SETID = B1.SETID
        AND A1.CONTACT_ID = B1.CONTACT_ID
        AND B1.EFFDT = (
            SELECT MAX(B_ED.EFFDT)
            FROM PS_CONTACT_CUST B_ED
            WHERE B1.SETID = B_ED.SETID
                AND B1.CONTACT_ID = B_ED.CONTACT_ID
                AND B_ED.EFFDT <= A.EFFDT
            )
        AND A.CNTCT_SEQ_NUM = B1.CNTCT_SEQ_NUM
        AND A.SETID = B1.CUSTOMER_SETID
        AND A.CUST_ID = B1.CUST_ID
    )

[Image: execution plan from SQL Sentry Plan Explorer]

Answer

To flesh out what Aaron states a little more, linked server performance, particularly for large result sets, for cross-server joins and cross-server subqueries, is often disappointing. If you watch the remote server with Profiler, you may find that the local server bombards the remote server with requests for the fetch of a single row to match join columns. When that happens, the network latency and calling overhead conspire to kill performance.

If you can query local data without too much trouble, that would be best. You might be able to restore a production backup or use SSIS (or even bcp, it still works) to copy data from the production server to some working tables on the local server. Generally SSIS, bcp and similar tactics are faster than linked servers and may help to avoid issues with log file growth.

If you must query the data from a remote server, you may find it more effective to rewrite the query so that it uses OPENQUERY() (rather than four-part names), 'sends' all of the 'remote parts' of the query over to the remote server, and then joins the results to the local data. SQL Server is supposed to be smart enough to move all of the joins to the remote server, but sometimes it doesn't, and OPENQUERY() gives you a way to force SQL Server to do what you want.
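As a rough sketch of that idea, the three linked-server tables from the question could be joined inside a single pass-through query. This assumes the linked server is named `linksrv` as in the question and that the join columns are as written there; the alias `R` is made up for illustration.

```sql
-- Sketch only: the text inside the quotes is executed on linksrv,
-- so the three-table join happens remotely and only the result
-- set crosses the network.
SELECT R.PERSON_ID,
       R.DESCR
FROM OPENQUERY([linksrv], '
    SELECT B.PERSON_ID, C.DESCR
    FROM prodDB.dbo.PS_BO_ROLE Z
    JOIN prodDB.dbo.PS_RD_PERSON B
        ON B.BO_ID = Z.BO_ID
    JOIN prodDB.dbo.PS_BO_ROLE_TYPE C
        ON C.ROLE_TYPE_ID = Z.ROLE_TYPE_ID
    WHERE Z.ROLE_END_DT >= GETDATE()
') AS R;
```

Note that GETDATE() inside the pass-through query is evaluated on the remote server, and that OPENQUERY() accepts only a literal string, not a variable, for the query text.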

Another, similar tactic would be to run the 'remote part' of the query first and put the results into a temporary table, optionally index the temporary table, and then join the 'local part' of the query to the temporary table. Again, this helps you to force SQL Server to do what it ought to.
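A minimal sketch of the temporary-table approach might look like the following. The table name `#remote_roles` and the index name are hypothetical, and the final SELECT is deliberately simplified: it leaves out the effective-date (MAX(EFFDT)) filters from the original query and uses joins where the original uses IN, so it can return duplicate rows that the original would not.

```sql
-- Step 1: pull the remote rows once. This statement touches only
-- linked-server tables, which makes it more likely (not guaranteed)
-- to be sent to the remote server as a single query.
SELECT B.PERSON_ID,
       C.DESCR
INTO #remote_roles
FROM [linksrv].[prodDB].dbo.PS_BO_ROLE Z
JOIN [linksrv].[prodDB].dbo.PS_RD_PERSON B
    ON B.BO_ID = Z.BO_ID
JOIN [linksrv].[prodDB].dbo.PS_BO_ROLE_TYPE C
    ON C.ROLE_TYPE_ID = Z.ROLE_TYPE_ID
WHERE Z.ROLE_END_DT >= GETDATE();

-- Step 2 (optional): index the column used in the local join.
CREATE CLUSTERED INDEX IX_remote_roles_person
    ON #remote_roles (PERSON_ID);

-- Step 3: join local tables to the local copy of the remote data.
-- Simplified: effective-date filters from the original are omitted.
SELECT A.SETID,
       A.CUST_ID,
       A.CNTCT_SEQ_NUM,
       A.NAME1,
       A.TITLE,
       R.DESCR
FROM PS_CUST_CONTACT A
JOIN PS_CONTACT_CUST B1
    ON  B1.CUSTOMER_SETID = A.SETID
    AND B1.CUST_ID        = A.CUST_ID
    AND B1.CNTCT_SEQ_NUM  = A.CNTCT_SEQ_NUM
JOIN PS_CONTACT A1
    ON  A1.SETID      = B1.SETID
    AND A1.CONTACT_ID = B1.CONTACT_ID
JOIN #remote_roles R
    ON R.PERSON_ID = A1.PERSON_ID
WHERE A.EFF_STATUS = 'A';

DROP TABLE #remote_roles;
```

After step 1 the optimizer is working only with local tables and has real row counts for the temporary table, which is where most of the benefit would be expected to come from.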

It sounds like more work, but SQL may be able to behave more efficiently. As always, watch your data types on the joins and your SARGs or your indexes will be ignored.
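To illustrate the point about SARGs, here is a small sketch using PS_CONTACT from the question. It assumes EFFDT is a DATETIME column with an index on it; the variable name is made up. Both queries return rows effective on or before today, but only the second leaves the column bare so that an index seek on EFFDT is possible.

```sql
-- Not sargable: the column is wrapped in a function, so every row
-- has to be converted before it can be compared.
SELECT SETID, CONTACT_ID
FROM PS_CONTACT
WHERE CONVERT(CHAR(10), EFFDT, 121) <= CONVERT(CHAR(10), GETDATE(), 121);

-- Sargable: compute the boundary once and compare the bare column.
-- (SQL Server 2005 has no DATE type and no inline DECLARE initializer.)
DECLARE @tomorrow DATETIME;
SET @tomorrow = DATEADD(day, DATEDIFF(day, 0, GETDATE()) + 1, 0);

SELECT SETID, CONTACT_ID
FROM PS_CONTACT
WHERE EFFDT < @tomorrow;
```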

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange