Question

Should I use a separate CROSS APPLY for each aliased expression or define multiple expressions in the same CROSS APPLY where possible?

I am refactoring many very complicated SQL queries (often four or five pages long with a dozen JOINs) that are generated in code. Often they are nested five or six queries deep, as some parts are reused in different situations. A typical reason for nesting the queries is that one query extracts some values, performs a computation, and assigns an alias to the value. Then the enclosing query makes further computations using the aliased value, etc. This is necessary because an aliased expression can only be referenced in an outer query, not elsewhere in the query that defines it. (The alternative, which is also present in the code, is to repeat the same lengthy subexpression a dozen times throughout the query.)

To simplify this spaghetti-SQL, I started to collapse these telescoping queries using CROSS APPLY. By using CROSS APPLY, I can assign an alias and use that elsewhere in the same query. Note that none of my CROSS APPLY clauses introduce new tables via subselects, or call UDFs.
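
As a minimal sketch of the collapse described above, the following compares a nested query with its CROSS APPLY equivalent. The temp table #Policy and its columns are hypothetical, invented only for illustration; they do not come from the real queries.

```sql
-- Hypothetical sample data, for illustration only.
CREATE TABLE #Policy (PolicyId int NOT NULL, Premium money NULL, Tax money NULL);
INSERT INTO #Policy (PolicyId, Premium, Tax)
VALUES (1, 100, 10), (2, NULL, 5), (3, 250, NULL);

-- Nested form: the alias [Total] is visible only to the enclosing query.
SELECT inner_q.PolicyId, inner_q.[Total]
FROM (
    SELECT p.PolicyId,
           [Total] = ISNULL(p.Premium, 0) + ISNULL(p.Tax, 0)
    FROM #Policy AS p
) AS inner_q
WHERE inner_q.[Total] > 50;

-- Collapsed form: the alias is defined in a CROSS APPLY, so the
-- WHERE clause of the same query can reference it directly.
SELECT p.PolicyId, c.[Total]
FROM #Policy AS p
CROSS APPLY (SELECT [Total] = ISNULL(p.Premium, 0) + ISNULL(p.Tax, 0)) AS c
WHERE c.[Total] > 50;

DROP TABLE #Policy;
```

Both statements are intended to return the same rows; the second simply avoids one level of nesting.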

At first, I put many expressions in the same CROSS APPLY. However, my code needs to be modular. Different fields will be needed in different situations, since users can select fields for filters, and we generate similar statistics for different columns, so we create many similar queries that differ only slightly from each other. Since some fields build upon others, and some depend upon certain JOINs being defined earlier in the query, I cannot put everything in the same CROSS APPLY. Consequently, the number of CROSS APPLYs may vary.

Furthermore, some of the source fields are shadowed by the new aliases and redefined (typically to remove NULLs, supply defaults, etc.). Thus, when a field defined in a CROSS APPLY is referenced elsewhere, the CROSS APPLY's table alias must be specified to remove the ambiguity.
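
To illustrate the shadowing, here is a small sketch. The temp table #Exposure and the output names RawAmount and CleanAmount are assumptions made for the example; only the VALUEAMT column name is borrowed from the queries in this question.

```sql
-- Hypothetical table; only the column name VALUEAMT mirrors the real schema.
CREATE TABLE #Exposure (LocId int NOT NULL, VALUEAMT money NULL);
INSERT INTO #Exposure (LocId, VALUEAMT) VALUES (1, 500), (2, NULL);

SELECT e.LocId,
       RawAmount   = e.[VALUEAMT],               -- original column, may be NULL
       CleanAmount = [VALUEAMT_COMP].[VALUEAMT]  -- redefined column, NULL replaced by 0
FROM #Exposure AS e
CROSS APPLY (SELECT [VALUEAMT] = ISNULL(e.[VALUEAMT], 0)) AS [VALUEAMT_COMP];
-- An unqualified [VALUEAMT] in the outer SELECT list would be rejected
-- as an ambiguous column name, hence the table alias on every reference.

DROP TABLE #Exposure;
```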

The upshot is that I need a logical way to name my CROSS APPLYs, and I care about performance. If the CROSS APPLYs are numbered with a simple counter (e.g. computed1, computed2, etc.) and their number changes, a given field's reference may change from computed3.[FIELD NAME] to computed4.[FIELD NAME]. That change ripples through the rest of the query, which may have been assembled by calls to different C# procedures. This is a maintenance problem.

The alternative that I am considering is putting each computed expression in a separate CROSS APPLY, and the name of the CROSS APPLY will be derived from the alias for the expression. For example:

Using one CROSS APPLY:

CROSS APPLY (
    SELECT
          [TIV_BLDG] = (CASE [COVERAGE] WHEN 'TIV_BLDG' THEN VALUEAMT ELSE 0 END)
        , [TIV_OSTR] = (CASE [COVERAGE] WHEN 'TIV_OSTR' THEN VALUEAMT ELSE 0 END)
        , [TIV_CONT] = (CASE [COVERAGE] WHEN 'TIV_CONT' THEN VALUEAMT ELSE 0 END)
        , [TIV_TIME] = (CASE [COVERAGE] WHEN 'TIV_TIME' THEN VALUEAMT ELSE 0 END)
) computed2

Using four CROSS APPLYs:

CROSS APPLY (SELECT [TIV_BLDG] = (CASE [COVERAGE] WHEN 'TIV_BLDG' THEN VALUEAMT ELSE 0 END)) AS [TIV_BLDG_COMP]
CROSS APPLY (SELECT [TIV_OSTR] = (CASE [COVERAGE] WHEN 'TIV_OSTR' THEN VALUEAMT ELSE 0 END)) AS [TIV_OSTR_COMP]
CROSS APPLY (SELECT [TIV_CONT] = (CASE [COVERAGE] WHEN 'TIV_CONT' THEN VALUEAMT ELSE 0 END)) AS [TIV_CONT_COMP]
CROSS APPLY (SELECT [TIV_TIME] = (CASE [COVERAGE] WHEN 'TIV_TIME' THEN VALUEAMT ELSE 0 END)) AS [TIV_TIME_COMP]

If I use this multiple CROSS APPLY approach, I can easily generate names that won't change if I add new applies or remove them, and I can predict in the code that references the fields what the table alias will be. However, this means many more CROSS APPLY statements. What are the performance implications? Is there a better way to do this, using CROSS APPLY or some other SQL Server feature? (I do not need this to be cross database, so Microsoft-only features are fine.)
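
For concreteness, a sketch of how the one-expression-per-APPLY naming might look when one computed field builds on others. The temp table #Coverage, the LocId column, and the derived field [TIV_TOTAL] are hypothetical; the [X_COMP] naming follows the convention proposed above.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE #Coverage (LocId int NOT NULL, [COVERAGE] varchar(20) NOT NULL, VALUEAMT money NULL);
INSERT INTO #Coverage (LocId, [COVERAGE], VALUEAMT)
VALUES (1, 'TIV_BLDG', 1000), (1, 'TIV_CONT', 200), (2, 'TIV_BLDG', 750);

SELECT c.LocId,
       [TIV_BLDG_COMP].[TIV_BLDG],
       [TIV_CONT_COMP].[TIV_CONT],
       [TIV_TOTAL_COMP].[TIV_TOTAL]
FROM #Coverage AS c
CROSS APPLY (SELECT [TIV_BLDG] = (CASE c.[COVERAGE] WHEN 'TIV_BLDG' THEN c.VALUEAMT ELSE 0 END)) AS [TIV_BLDG_COMP]
CROSS APPLY (SELECT [TIV_CONT] = (CASE c.[COVERAGE] WHEN 'TIV_CONT' THEN c.VALUEAMT ELSE 0 END)) AS [TIV_CONT_COMP]
-- A later APPLY can reference earlier ones by their stable names;
-- [TIV_TOTAL] is an invented field used only to show the dependency.
CROSS APPLY (SELECT [TIV_TOTAL] = [TIV_BLDG_COMP].[TIV_BLDG] + [TIV_CONT_COMP].[TIV_CONT]) AS [TIV_TOTAL_COMP];

DROP TABLE #Coverage;
```

Because each table alias is derived from the field name, adding or removing an APPLY does not change how any other field is referenced.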

Solution

If there is a performance disadvantage, it must be in the compilation stage of query execution. Because execution plans are cached, there should be no such penalty from the second time the query executes onward.
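
One way to check whether a plan is actually being reused is to look at the plan cache statistics. This is only a sketch: it assumes you have VIEW SERVER STATE permission, and the LIKE filter is a crude, illustrative way to find the generated queries (it will also match this diagnostic query itself).

```sql
-- List recently compiled statements whose text mentions CROSS APPLY,
-- together with how often each cached plan has been executed.
SELECT TOP (20)
       qs.creation_time,    -- when the plan was compiled
       qs.execution_count,  -- executions since compilation
       st.[text]            -- batch text the plan belongs to
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.[text] LIKE N'%CROSS APPLY%'
ORDER BY qs.creation_time DESC;
```

An execution_count above 1 for an unchanged creation_time suggests the plan was reused rather than recompiled. Note that generated SQL that differs in its text from run to run will get a separate plan for each variant.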

SQL Server can see through the CROSS APPLYs and convert them to simple, stacked "compute scalars".
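
To verify this on your own queries, you can compare the estimated plans of the two variants. The sketch below assumes a tool that understands the GO batch separator (such as SSMS or sqlcmd) and uses a hypothetical temp table #Coverage. With SHOWPLAN_TEXT on, the statements are compiled but not executed.

```sql
-- Hypothetical table; created before SHOWPLAN is enabled so it exists at compile time.
CREATE TABLE #Coverage ([COVERAGE] varchar(20) NOT NULL, VALUEAMT money NULL);
GO
SET SHOWPLAN_TEXT ON;
GO
-- Variant 1: both expressions in one CROSS APPLY.
SELECT computed.[TIV_BLDG], computed.[TIV_CONT]
FROM #Coverage AS c
CROSS APPLY (
    SELECT [TIV_BLDG] = (CASE c.[COVERAGE] WHEN 'TIV_BLDG' THEN c.VALUEAMT ELSE 0 END)
         , [TIV_CONT] = (CASE c.[COVERAGE] WHEN 'TIV_CONT' THEN c.VALUEAMT ELSE 0 END)
) AS computed;

-- Variant 2: one CROSS APPLY per expression.
SELECT [TIV_BLDG_COMP].[TIV_BLDG], [TIV_CONT_COMP].[TIV_CONT]
FROM #Coverage AS c
CROSS APPLY (SELECT [TIV_BLDG] = (CASE c.[COVERAGE] WHEN 'TIV_BLDG' THEN c.VALUEAMT ELSE 0 END)) AS [TIV_BLDG_COMP]
CROSS APPLY (SELECT [TIV_CONT] = (CASE c.[COVERAGE] WHEN 'TIV_CONT' THEN c.VALUEAMT ELSE 0 END)) AS [TIV_CONT_COMP];
GO
SET SHOWPLAN_TEXT OFF;
GO
DROP TABLE #Coverage;
```

If the optimizer behaves as described, both plans should show a table scan feeding Compute Scalar operators, with no join operators introduced by the APPLYs. Check the output on your own server version rather than relying on this expectation.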

Another potential problem might be the size of the query. I am not sure whether any limits on the number of AST nodes or joins are relevant here (I would guess that a CROSS APPLY counts as a join when it comes to internal limits).

Apart from these problems I think that CROSS APPLYs are a very nice and fast solution.

Licensed under: CC-BY-SA with attribution