Question

I have a query which runs little slower.

SELECT b.BreakdownClassificationId,
       k.IsinCode,
       k.SedolCode,
       ClassificationDate,
       NAME,
       InstrumentType,
       GeographicalLocation,
       CapSize,
       Currency,
       ExchangeName,
       HoldingDomicile,
       MaturityDate,
       Sector,
       MajorSector
FROM   #BreakdownSet b
       OUTER apply (SELECT TOP 1 IsinCode,
                                 SedolCode,
                                 ClassificationDate,
                                 NAME,
                                 InstrumentType,
                                 GeographicalLocation,
                                 CapSize CapSize,
                                 Currency,
                                 ExchangeName,
                                 HoldingDomicile,
                                 MaturityDate,
                                 Sector,
                                 MajorSector
                    FROM   dbfinex.dbo.PfPortfolioHoldingClassificationFtid x WITH (nolock)
                    WHERE  ( x.isincode > ''
                             AND x.isincode = b.breakdowncode )
                            OR ( x.sedolcode > ''
                                 AND x.sedolcode = b.breakdowncode )
                            OR ( x.sedolcode > ''
                                 AND x.sedolcode = b.sedolcode )
                            OR ( x.isincode > ''
                                 AND x.isincode = b.isincode )
                    ORDER  BY CASE
                                WHEN x.sedolcode = b.breakdowncode THEN 1
                                WHEN x.isincode = b.breakdowncode THEN 2
                                WHEN x.sedolcode = b.sedolcode THEN 3
                                WHEN x.isincode = b.isincode THEN 4
                                ELSE 5
                              END,
                              classificationdate DESC) k 

Execution Plan

enter image description here

Order By inside the Cross Apply is very costly, is there any better way to write this query ?

Was it helpful?

Solution

You can eliminate the sort if you like, although it's hard to say if that will necessary improve query performance. The key is how you constructed your ORDER BY clause along with the search conditions. If there's any row that matches x.sedolcode = b.breakdowncode then you want to take that row, otherwise you go to the next condition. With the right indexes we can avoid the sort by splitting up the APPLY. The SQL Server query optimizer actually gives a nice hint because it transforms your OR conditions into UNIONs.

I'm going to create a limited example in order to show the general query plan shape that I'm after. I will also assume that the PfPortfolioHoldingClassificationFtid table has a primary key and a clustering key on a PK column. Here is my test data:

CREATE TABLE #BreakdownSet (
    BreakdownClassificationId BIGINT NOT NULL,
    breakdowncode VARCHAR(10) NULL,
    sedolcode VARCHAR(10) NULL,
    isincode VARCHAR(10) NULL
);

INSERT INTO #BreakdownSet
SELECT 
  t.RN
, CASE WHEN RN % 10 = 1 THEN t.RN ELSE NULL END
, CASE WHEN RN % 10 = 4 THEN t.RN ELSE NULL END
, CASE WHEN RN % 10 = 7 THEN t.RN ELSE NULL END
FROM
(
    SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2
) t;

CREATE TABLE dbo.PfPortfolioHoldingClassificationFtid (
    PK BIGINT NOT NULL,
    isincode VARCHAR(10) NOT NULL,
    sedolcode VARCHAR(10) NOT NULL,
    ClassificationDate DATE NOT NULL,
    OTHER_COLUMN VARCHAR(200) NOT NULL,
    PRIMARY KEY (PK)
);

INSERT INTO dbo.PfPortfolioHoldingClassificationFtid WITH (TABLOCK)
SELECT 
  t.RN
, t.RN
, t.RN
, DATEADD(DAY, t.rn / 100, '20170101')
, REPLICATE('OTHER', 40)
FROM
(
    SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) RN
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2
) t;

CREATE INDEX isin_date ON dbo.PfPortfolioHoldingClassificationFtid (isincode, ClassificationDate);

CREATE INDEX sedol_date ON dbo.PfPortfolioHoldingClassificationFtid (sedolcode, ClassificationDate);

Here is your query with my table definitions:

SELECT b.breakdownclassificationid, 
       k.isincode, 
       k.sedolcode, 
       classificationdate, 
       other_column 
FROM   #breakdownset b 
       OUTER apply (SELECT TOP 1 isincode, 
                                 sedolcode, 
                                 classificationdate, 
                                 other_column 
                    FROM   dbo.pfportfolioholdingclassificationftid x WITH ( 
                           nolock) 
                    WHERE  ( x.isincode > '' 
                             AND x.isincode = b.breakdowncode ) 
                            OR ( x.sedolcode > '' 
                                 AND x.sedolcode = b.breakdowncode ) 
                            OR ( x.sedolcode > '' 
                                 AND x.sedolcode = b.sedolcode ) 
                            OR ( x.isincode > '' 
                                 AND x.isincode = b.isincode ) 
                    ORDER  BY CASE 
                                WHEN x.sedolcode = b.breakdowncode THEN 1 
                                WHEN x.isincode = b.breakdowncode THEN 2 
                                WHEN x.sedolcode = b.sedolcode THEN 3 
                                WHEN x.isincode = b.isincode THEN 4 
                                ELSE 5 
                              END, 
                              classificationdate DESC) k; 

Not surprisingly I get a different plan than you, but the sort still has a high estimated cost. What if we split the APPLY into four parts and have each APPLY only return the primary key of the table? If we have a covering index for each APPLY then we can find the primary key of the matching row with at most four index seeks. No sorting is required. We can also skip doing the seeks that we don't need by adding filters into the APPLYs but that isn't guaranteed behavior. Here's one way to write it:

SELECT b.breakdownclassificationid, 
       k.isincode, 
       k.sedolcode, 
       classificationdate, 
       other_column 
FROM   #breakdownset b 
       OUTER apply (SELECT TOP 1 pk 
                    FROM   dbo.pfportfolioholdingclassificationftid x 
                    WHERE  x.sedolcode = b.breakdowncode 
                    ORDER  BY classificationdate DESC) a1 
       OUTER apply (SELECT TOP 1 pk 
                    FROM   dbo.pfportfolioholdingclassificationftid x 
                    WHERE  x.isincode = b.breakdowncode 
                           AND a1.pk IS NOT NULL 
                    ORDER  BY classificationdate DESC) a2 
       OUTER apply (SELECT TOP 1 pk 
                    FROM   dbo.pfportfolioholdingclassificationftid x 
                    WHERE  x.sedolcode = b.sedolcode 
                           AND a2.pk IS NOT NULL 
                    ORDER  BY classificationdate DESC) a3 
       OUTER apply (SELECT TOP 1 pk 
                    FROM   dbo.pfportfolioholdingclassificationftid x 
                    WHERE  x.isincode = b.isincode 
                           AND a3.pk IS NOT NULL 
                    ORDER  BY classificationdate DESC) a4 
       LEFT OUTER JOIN dbo.pfportfolioholdingclassificationftid k 
                    ON k.pk = COALESCE(a1.pk, a2.pk, a3.pk, a4.pk);

With 1.5 M rows in the temp table, the worst possible case should be 6 million nonclustered index seeks and 1.5 million clustered index seeks.

The query runs in two seconds on my machine. The runtime doesn't matter because I have nonsense data. However, there isn't a sort in the plan. I uploaded the query plan to Paste The Plan. You should consider doing that as well for future questions. Here's also a screenshot of the actual plan:

actual plan hash

The query that you have does a nested loop join and doesn't go parallel. With LOOP JOIN and MAXDOP 1 hints the query finishes in seven seconds on my machine. Here's the plan and here's a screenshot:

enter image description here

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top