Question

This question is specific to SQL Server. Suppose I write a query such as:

SELECT * FROM IndustryData WHERE Date='20131231'
AND ReportTypeID = CASE WHEN (fnQuarterDate('20131231')='20131231') THEN  1 
                        WHEN (fnQuarterDate('20131231')!='20131231') THEN  4
                        END;

Is the function call fnQuarterDate (or any subquery) inside a CASE in the WHERE clause executed for EACH row of the table?

Would it be better to compute the function's (or subquery's) value beforehand and store it in a variable, like this:

DECLARE @X INT
IF fnQuarterDate('20131231')='20131231'
SET @X=1 
ELSE
SET @X=0
SELECT * FROM IndustryData WHERE Date='20131231'
AND ReportTypeID = CASE WHEN (@X = 1) THEN  1 
                        WHEN (@X = 0) THEN  4
                        END;

I know that in MySQL a subquery inside IN(..) within a WHERE clause is executed for each row; I just wanted to find out whether the same applies to SQL Server.

...

I just populated the table with about 30K rows and measured the time difference:

Query 1 = 70 ms; Query 2 = 6 ms. I think that answers it, but I still don't know the actual reasons behind it.

Also, would there be any difference if a simple subquery were used instead of a UDF?

Solution

I think your solution may in theory improve performance, but that depends on what the scalar function actually does. If, as I guess, it just maps the date to the last day of its quarter, I think the cost in this case would be negligible.

You may want to read this page with suggested workarounds:

http://connect.microsoft.com/SQLServer/feedback/details/273443/the-scalar-expression-function-would-speed-performance-while-keeping-the-benefits-of-functions#

Because SQL Server must execute each function on every row, using any function incurs a cursor like performance penalty.

And in the Workarounds section, there is this comment:

I had the same problem when I used scalar UDF in join column, the performance was horrible. After I replaced the UDF with temp table that contains the results of UDF and used it in join clause, the performance was order of magnitudes better. MS team should fix UDF's to be more reliable.

So yes, it appears that computing the function's value up front may improve performance.
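The temp-table workaround described in that comment could look roughly like the sketch below. It is only an illustration: the dbo schema prefix, the temp table name #QuarterDates, and the assumption that fnQuarterDate accepts values of the Date column are guesses, not details from the question.

```sql
-- Hypothetical sketch: materialize the UDF result once per distinct date,
-- then join to the stored results instead of calling the UDF in the predicate.
-- Assumption: the function lives in the dbo schema and accepts the Date column's type.
SELECT d.[Date],
       dbo.fnQuarterDate(d.[Date]) AS QuarterDate
INTO   #QuarterDates
FROM   (SELECT DISTINCT [Date] FROM IndustryData) AS d;

-- The main query now only compares stored values.
SELECT ID.*
FROM   IndustryData AS ID
JOIN   #QuarterDates AS q
       ON q.[Date] = ID.[Date]
WHERE  ID.ReportTypeID = CASE WHEN q.QuarterDate = ID.[Date] THEN 1 ELSE 4 END;

DROP TABLE #QuarterDates;
```

For a single fixed date, as in the question, a variable is simpler. A temp table mainly pays off when the function would otherwise be applied to many different input values.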

Your solution is correct, but I would recommend using ELSE instead; it looks cleaner to me:

AND ReportTypeID = CASE WHEN (@X = 1) THEN  1 
                    ELSE 4
                    END;
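Put together, a minimal sketch of the precomputed version might look like the following. The dbo prefix is an assumption (SQL Server requires scalar UDFs to be called with at least a two-part name), the variable names are illustrative, and the CHAR(8) type is a guess based on the literal used in the question.

```sql
-- Assumption: fnQuarterDate lives in the dbo schema.
DECLARE @DateBasis CHAR(8) = '20131231';

-- The function is called once here, outside the query.
DECLARE @ReportTypeID INT =
    CASE WHEN dbo.fnQuarterDate(@DateBasis) = @DateBasis THEN 1 ELSE 4 END;

-- The query itself now filters on two plain variables.
SELECT *
FROM   IndustryData
WHERE  [Date] = @DateBasis
  AND  ReportTypeID = @ReportTypeID;
```

Declaring a variable with an initializer requires SQL Server 2008 or later; on older versions a separate SET statement would be needed.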

OTHER TIPS

It depends. See User-Defined Functions:

The number of times that a function specified in a query is actually executed can vary between execution plans built by the optimizer. An example is a function invoked by a subquery in a WHERE clause. The number of times the subquery and its function is executed can vary with different access paths chosen by the optimizer.
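Since the number of calls depends on the plan, it may be easiest to measure both forms on your own data. A rough sketch using SET STATISTICS TIME follows; the dbo prefix and the variable name @ReportTypeID are assumptions for illustration.

```sql
SET STATISTICS TIME ON;

-- Form 1: function call inside the WHERE clause
-- (assumption: the function lives in the dbo schema).
SELECT *
FROM   IndustryData
WHERE  [Date] = '20131231'
  AND  ReportTypeID = CASE WHEN dbo.fnQuarterDate('20131231') = '20131231'
                           THEN 1 ELSE 4 END;

-- Form 2: function result computed once into a variable.
DECLARE @ReportTypeID INT =
    CASE WHEN dbo.fnQuarterDate('20131231') = '20131231' THEN 1 ELSE 4 END;

SELECT *
FROM   IndustryData
WHERE  [Date] = '20131231'
  AND  ReportTypeID = @ReportTypeID;

SET STATISTICS TIME OFF;
```

The CPU and elapsed times for each statement appear in the Messages output. Comparing the actual execution plans of the two SELECT statements can show where the function is evaluated.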

This approach uses inline MySQL user variables, so it applies to MySQL rather than SQL Server. The derived table aliased "sqlvars" first sets @dateBasis to the date in question, then sets a second variable, @qtrReportType, from a function call made ONCE for the entire query. Because sqlvars yields a single row, the cross join (there is no join condition between the two tables) simply makes those values available when filtering your IndustryData table.

select
      ID.*
   from
      ( select 
              @dateBasis := '20131231',
              @qtrReportType := case when fnQuarterDate(@dateBasis) = @dateBasis 
                                then 1 else 4 end ) sqlvars,
      IndustryData ID
   where
          ID.Date = @dateBasis
      AND ID.ReportTypeID = @qtrReportType
Licensed under: CC-BY-SA with attribution