Question

I just saw this function definition:

create function dbo.f (@a int,  @b int)  
returns integer
as  
begin 
return  case when 
        not exists (Select * from t1 where t1.col1 = @a)
        AND @b > 0
        then 1 else 0 end
end
GO

Seeing a not exists I thought attention full table scan and tried to improve it

create function dbo.f (@a int,  @b int)  
returns integer
as  
begin 
return  case when 
        exists (Select * from t1 where t1.col1 = @a)
        OR @b > 0
        then 0 else 1 end
end

GO

My feeling is, that this transformation could have been done by an optimizer. It seems to be straight forward, but how can I be sure if he does?

Comment on Igor's answer: (comparison fixed thanks to Matts comment)

This inspires me to the following:

create function dbo.f (@a int,  @b int)  
returns integer
as  
begin 
 IF @b <= 0
   RETURN 0

 IF exists (Select * from t1 where t1.col1 = @a)
   RETURN 0

  RETURN 1
end
GO
Was it helpful?

Solution

One thing you could do to compare these two versions is to rename one of them to f_old and execute them both inside the same batch, while including the actual execution plan and IO statistics. From there, it is a simple matter of comparing the plans and IO stats of each.

One thing you need to be wary of is that in your first statement you had IF @b > 0. In your second statement you have IF @b < 0. What happens if @b = 0? You have not covered that possibility.

Finally, I imagine you will find that your thoughts regarding NOT EXISTS vs. EXISTS is correct. EXISTS will process until the condition is met. NOT EXISTS requires a table scan (if not indexed) and an index scan (if indexed). On a small table, you will not see a huge performance gain, but it is always best to design your logic around EXISTS, as opposed to NOT EXISTS.

Matt

OTHER TIPS

I'm pretty sure SQL server would calculate the Select * from t1 ... first, and only then check to see whether the result exists or not. If that's true, then both versions should be essentially identical speedwise. I wonder if you could test this by running these functions on two large tables, and see if there is a consistent different in the time it takes for the functions to return?

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange
scroll top