When are procedural queries absolutely necessary?

https://dba.stackexchange.com/questions/4610

16-10-2019

Question

I know that we tend to avoid cursors and loops in SQL Server at all costs, but what are some situations where you absolutely need procedural queries because set-based queries just will not give you the results?

I understand the difference between the two; I have just never run into a situation where I needed a cursor. I'm wondering whether such situations exist.

Solution

In my experience, I've run into a few cases where procedural/iterative approaches were warranted.

API only allows for single-row operation

If I wanted to programmatically change the data type from real to decimal in a table that has 500 mistyped columns, as this SO question asks, a cursor is a fine approach, because the DDL does not allow altering multiple columns in a single statement.
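As a rough sketch of that one-statement-per-column loop, the Python below builds a separate ALTER TABLE ... ALTER COLUMN statement for each column. The function names, the sample column list, and the decimal(18, 4) target type are all assumptions made for illustration; in practice the list would come from something like INFORMATION_SCHEMA.COLUMNS, and the right precision and scale depend on the data.

```python
def quote_name(name):
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def build_alter_statements(columns, new_type="decimal(18, 4)"):
    """Return one ALTER TABLE ... ALTER COLUMN statement per column.

    `columns` is an iterable of (schema, table, column, is_nullable) tuples.
    The default target type is a placeholder, not a recommendation.
    """
    statements = []
    for schema, table, column, is_nullable in columns:
        # ALTER COLUMN restates nullability, so carry it over explicitly.
        nullability = "NULL" if is_nullable else "NOT NULL"
        statements.append(
            f"ALTER TABLE {quote_name(schema)}.{quote_name(table)} "
            f"ALTER COLUMN {quote_name(column)} {new_type} {nullability};"
        )
    return statements


# Hypothetical input: two mistyped columns on one table.
sample_columns = [
    ("dbo", "Readings", "temperature", True),
    ("dbo", "Readings", "pressure", False),
]
for statement in build_alter_statements(sample_columns):
    print(statement)
```

Each generated statement would then be executed on its own, which is exactly the row-by-row work a cursor does inside T-SQL.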

Set based doesn't scale

If you have the SQL Server MVP Deep Dives book, chapter 4, "Set-based iteration: the third alternative" by Hugo Kornelis, has some great use cases for combined cursor/set-based operations. Two classic problems that the chapter's author references are running totals and bin packing.
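To make the running totals case concrete, here is a minimal sketch of the purely iterative version: one ordered pass that carries the total forward, which is what a cursor does. It is not the chapter's own code. It uses Python's sqlite3 module as a stand-in for SQL Server, and the ledger table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ledger (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO ledger (id, amount) VALUES (?, ?)",
    [(1, 10), (2, -3), (3, 25), (4, 8)],
)


def running_totals(connection):
    """Walk the rows once, in id order, accumulating the total as we go."""
    total = 0
    result = []
    for row_id, amount in connection.execute(
        "SELECT id, amount FROM ledger ORDER BY id"
    ):
        total += amount
        result.append((row_id, total))
    return result


totals = running_totals(conn)
print(totals)
```

The single pass touches each row once, whereas a naive self-join that re-sums every earlier row for each row does quadratic work.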

I used the set-based iteration approach with good success for a poorly designed process I inherited at my last job. In short, there was a process that once a year had to update 50-75M rows, and attempting to do so in a single set would blow our logs. Chunking the updates into smaller batches of N rows allowed the log to keep up, and the process actually finished faster than the previous year, when they just allocated a metric ton more disk space.
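The batching pattern can be sketched as a loop that updates at most N rows per transaction and stops when nothing is left to update. This is an illustration only, not the original process: it runs against an in-memory SQLite database, and the orders table, the processed flag, and the batch size are assumptions. In T-SQL the same shape is usually written as a WHILE loop around UPDATE TOP (N) that checks @@ROWCOUNT.

```python
import sqlite3


def update_in_batches(connection, batch_size):
    """Flag unprocessed rows in chunks of `batch_size`, committing each chunk.

    Returns the number of batches that changed at least one row.
    """
    batches = 0
    while True:
        cursor = connection.execute(
            """
            UPDATE orders
               SET processed = 1
             WHERE id IN (SELECT id
                            FROM orders
                           WHERE processed = 0
                           ORDER BY id
                           LIMIT ?)
            """,
            (batch_size,),
        )
        # Committing per batch keeps each transaction small.
        connection.commit()
        if cursor.rowcount == 0:
            break
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, processed INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO orders (id) VALUES (?)", [(i,) for i in range(1, 1001)])
conn.commit()

batches = update_in_batches(conn, 300)
remaining = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE processed = 0"
).fetchone()[0]
print(batches, remaining)
```

Each batch is still a set-based statement; only the outer loop is procedural, which is the point of the hybrid approach.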

OTHER TIPS

When something can't be done set based.

Bleeding obvious, of course. But note there is a difference between "not set based" and folk using a procedural solution because they don't understand sets or don't know how to do it with set-based code.

One example of procedural code would be sending an email per row, with different content for each row.

A lot of SQL code for DBA use is procedural. For example, looping (CURSOR or WHILE: no difference) over databases and tables to rebuild indexes and update statistics.

Some SQL constructs allow row-by-row processing in the context of a set, such as CROSS APPLY like this on SO: SELECT TOP 5 rows for each FK (note the ROW_NUMBER() solution too though)
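For reference, the ROW_NUMBER() variant of "top N rows per foreign key" looks roughly like the sketch below. SQLite has no CROSS APPLY, so only the ROW_NUMBER() form is shown, run through Python's sqlite3 module; it assumes the underlying SQLite library is version 3.25 or later (window function support). The child table and its columns are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, created INTEGER)"
)
# Seven child rows for each of two parents.
rows = [(parent, created) for parent in (1, 2) for created in range(1, 8)]
conn.executemany("INSERT INTO child (parent_id, created) VALUES (?, ?)", rows)

TOP_N_PER_PARENT = """
SELECT parent_id, created
  FROM (SELECT parent_id,
               created,
               ROW_NUMBER() OVER (PARTITION BY parent_id
                                  ORDER BY created DESC) AS rn
          FROM child) AS ranked
 WHERE rn <= ?
 ORDER BY parent_id, created DESC
"""

# Number the rows within each parent, then keep the first five.
top5 = conn.execute(TOP_N_PER_PARENT, (5,)).fetchall()
print(top5)
```

The numbering happens per partition, so the query stays a single set-based statement even though it answers a "for each parent" question.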

Edit: extending @billinkc's answer...

CROSS APPLY allows set based operations with UDFs that have a "single row API"

I know you are asking about SQL Server, but in the Oracle world (in the past), temporary tables had a very high cost, so cursor based procedures and triggers were quicker and lower "cost" to the server. In SQL Server, cursors used to have far higher cost than temp tables, so writing cursor based code was discouraged. I'm pretty sure these discrepancies have been eliminated in the past decade.

To cope with these situations, most people have a general rule to avoid putting business logic into the database. If you can absolutely totally always do that, then there won't be any reason for procedural logic in either T-SQL or PL/SQL. Relational databases are great at set-based logic. Most modern programming languages are great at procedural logic. It is best to use each one for what it is good at.

Some auditing triggers that I've worked with had rather complicated rules for what had to be checked, and where things had to be updated/logged. Some were for keeping reporting systems in sync with transactional systems (it wasn't my choice, but they wanted it that way). Some were for a formulary system. A formulary is a list of drugs and, for each insurance company, what they will/won't cover and, if drug_X is prescribed, which replacements are covered by insurance. It was also common for different group policies at the same insurance company to pay for different drugs.

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange