SQL stored proc performance - huge WHERE clause

https://stackoverflow.com/questions/4710745

11-10-2019
|

Question

I have a stored proc that runs a query with a HUGE where clause. In itself, the WHERE clause is very simple. It looks like

SELECT a, b, c FROM table
WHERE (cond1) OR (cond2) OR (cond3) OR (cond4)

where cond1, cond2, cond3 and cond4 all represent some requirement from our users.

My question is regarding the query performance: would it make sense to execute 4 separeate queries (each with one of the conditions cond{1..4}), insert the results into a temporary table, and then finally select everything from that temporary table?

What I'm wondering is, whether dbms' optimize for such situations.

FWIW, i'm using Syabse ASE - TDS 5.5.

Thanks Harshath

PS: Please don't ask me to "do my own benchmarking". I will of couse be doing that eventually. What i'm really looking for links pointing to the internals of such optimizations, if any. TY :)

Solution

The reason split-up queries can be faster is that the smaller separate queries can be resolved using indexes, where the big query results in a table scan.

The first question is: does the table have indexes? If not, a table scan will always be required, and splitting the query into N parts will just cause N table scans.

If there are indexes, you'd have to test if your the optimizer decides to use them. If the query optimizer already uses them, there is no point in splitting the query.

If there are indexes, and the query optimizer does not use them, and your testing shows that it would be faster to use them, you can benefit from splitting the query.

OTHER TIPS

If you include everything in a single WHERE clause, at least the DBMS will have an opportunity to optimize it. If you use separate queries, then the DBMS won't be able to optimize.

Besides, it seems to me you are adding a bunch of overhead with the temporary table, multiple queries, and combining everything together. I can't imagine how your second option could ever be faster or even as fast.

Having a larger WHERE will almost always be faster than separating it out into multiple queries and then appending the results.

Consider that in multiple queries are you are scanning the table for each query - which adds considerable overhead, not to mention any kind of disk IO that might happen. It's best to compare all you need to compare while you have the data in memory, than risk losing it from memory only to have to pull it back into memory for the next query.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow