When/Why to use Cascading in SQL Server?

https://stackoverflow.com/questions/59297

09-06-2019

Question

When setting up foreign keys in SQL Server, under what circumstances should you have it cascade on delete or update, and what is the reasoning behind it?

This probably applies to other databases as well.

I'm looking most of all for concrete examples of each scenario, preferably from someone who has used them successfully.

Solution

Summary of what I've seen so far:

Some people don't like cascading at all.

Cascade Delete

Cascade Delete may make sense when the semantics of the relationship can involve an exclusive "is part of" description. For example, an OrderLine record is part of its parent order, and OrderLines will never be shared between multiple orders. If the Order were to vanish, the OrderLine should as well, and a line without an Order would be a problem.
The canonical example for Cascade Delete is SomeObject and SomeObjectItems, where it doesn't make any sense for an item record to ever exist without a corresponding main record.
You should not use Cascade Delete if you are preserving history or using a "soft/logical delete" where you only set a deleted bit column to 1/true.
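The Order/OrderLine case above can be sketched as follows. This is only a minimal illustration, not SQL Server code: it uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in, and the table and column names (Orders, OrderLine, Product) are assumptions made up for the example. The ON DELETE CASCADE clause is written much the same way in SQL Server, though data types and other details differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY
    );
    CREATE TABLE OrderLine (
        OrderLineId INTEGER PRIMARY KEY,
        OrderId     INTEGER NOT NULL
                    REFERENCES Orders (OrderId) ON DELETE CASCADE,
        Product     TEXT NOT NULL
    );
    INSERT INTO Orders (OrderId) VALUES (1), (2);
    INSERT INTO OrderLine (OrderId, Product) VALUES
        (1, 'widget'), (1, 'gadget'), (2, 'gizmo');
""")

# Deleting the parent order also removes the lines that are "part of" it.
conn.execute("DELETE FROM Orders WHERE OrderId = 1")

remaining_lines = conn.execute(
    "SELECT OrderId, Product FROM OrderLine ORDER BY OrderLineId"
).fetchall()
print(remaining_lines)
```

Only the line belonging to order 2 should be left; no OrderLine row can end up pointing at a missing order.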

Cascade Update

Cascade Update may make sense when you use a real key rather than a surrogate key (identity/autoincrement column) across tables.
The canonical example for Cascade Update is when you have a mutable foreign key, like a username that can be changed.
You should not use Cascade Update with keys that are Identity/autoincrement columns.
Cascade Update is best used in conjunction with a unique constraint.
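A minimal sketch of the mutable-username case, again using Python's sqlite3 with in-memory SQLite as a stand-in for SQL Server. The Users and Posts tables are hypothetical names chosen for illustration; the point is only that the referenced key is a natural key (declared PRIMARY KEY here, a UNIQUE constraint would also do) and the FK carries ON UPDATE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is opt-in in SQLite

conn.executescript("""
    CREATE TABLE Users (
        UserName TEXT PRIMARY KEY          -- natural key that users may change
    );
    CREATE TABLE Posts (
        PostId   INTEGER PRIMARY KEY,
        UserName TEXT NOT NULL
                 REFERENCES Users (UserName) ON UPDATE CASCADE,
        Title    TEXT NOT NULL
    );
    INSERT INTO Users (UserName) VALUES ('alice');
    INSERT INTO Posts (UserName, Title) VALUES
        ('alice', 'first'), ('alice', 'second');
""")

# Renaming the user propagates the new key value to every referencing row.
conn.execute("UPDATE Users SET UserName = 'alice_b' WHERE UserName = 'alice'")

post_owners = [
    row[0] for row in conn.execute("SELECT UserName FROM Posts ORDER BY PostId")
]
print(post_owners)
```

With a surrogate identity key there would be nothing to propagate, which is why Cascade Update adds little in that design.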

When To Use Cascading

You may want to get an extra strong confirmation back from the user before allowing an operation to cascade, but it depends on your application.
Cascading can get you into trouble if you set up your foreign keys wrong. But you should be okay if you do that right.
It's not wise to use cascading before you understand it thoroughly. However, it is a useful feature and therefore worth taking the time to understand.

OTHER TIPS

Foreign keys are the best way to ensure referential integrity of a database. Avoiding cascades because they are "magic" is like writing everything in assembly because you don't trust the magic behind compilers.

What is bad is the wrong use of foreign keys, like creating them backwards, for example.

Juan Manuel's example is the canonical one: if you do the deletes in code, there are many more chances of leaving spurious DocumentItems in the database that will come back and bite you.

Cascading updates are useful, for instance, when you have references to the data by something that can change, say a primary key of a users table is the name,lastname combination. Then you want changes in that combination to propagate to wherever they are referenced.

@Aidan, that clarity you refer to comes at a high cost: the chance of leaving spurious data in your database, which is not small. To me, it's usually just lack of familiarity with the DB, and an inability to find which FKs are in place before working with it, that fosters that fear. Either that, or constant misuse of cascade: using it where the entities are not conceptually related, or where you have to preserve history.

I never use cascading deletes.

If I want something removed from the database, I want to explicitly tell the database what I want taken out.

Of course they are a function available in the database and there may be times when it is okay to use them, for example if you have an 'order' table and an 'orderItem' table you may want to clear the items when you delete an order.

I like the clarity that I get from doing it in code (or stored procedure) rather than 'magic' happening.

For the same reason I am not a fan of triggers either.

Something to notice is that if you do delete an 'order', you will get a '1 row affected' report back even if the cascaded delete has removed 50 'orderItem's.
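The row-count point can be illustrated with a small sketch. It runs against SQLite through Python's sqlite3 module, not against SQL Server, so it shows an analogous behaviour and proves nothing about SQL Server itself. The Orders and OrderItem tables are assumptions for the example. SQLite's changes() function counts only rows changed directly by the statement, not rows removed by foreign key actions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY
    );
    CREATE TABLE OrderItem (
        OrderItemId INTEGER PRIMARY KEY,
        OrderId     INTEGER NOT NULL
                    REFERENCES Orders (OrderId) ON DELETE CASCADE
    );
    INSERT INTO Orders (OrderId) VALUES (1);
""")
# Fifty items hanging off the single order.
conn.executemany("INSERT INTO OrderItem (OrderId) VALUES (?)", [(1,)] * 50)

conn.execute("DELETE FROM Orders WHERE OrderId = 1")

# Rows reported for the DELETE itself versus rows actually gone.
reported = conn.execute("SELECT changes()").fetchone()[0]
items_left = conn.execute("SELECT COUNT(*) FROM OrderItem").fetchone()[0]
print(reported, items_left)
```

If the number of cascaded rows matters to you, count the child rows before the delete rather than relying on the reported figure.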

I work a lot with cascading deletes.

It feels good to know that whoever works against the database can never leave any unwanted data behind. If dependencies grow, I just change the constraints in the diagram in Management Studio, and I don't have to tweak stored procedures or data access code.

That said, I have one problem with cascading deletes, and that's circular references. This often leads to parts of the database that have no cascading deletes.

I do a lot of database work and rarely find cascade deletes useful. The one time I have used them effectively is in a reporting database that is updated by a nightly job. I make sure that any changed data is imported correctly by deleting any top-level records that have changed since the last import, then reimporting the modified records and anything that relates to them. It saves me from having to write a lot of complicated deletes that work from the bottom to the top of my database.

I don't consider cascade deletes to be quite as bad as triggers, as they only delete data; triggers can have all kinds of nasty stuff inside.

In general I avoid real deletes altogether and use logical deletes (i.e. having a bit column called isDeleted that gets set to true) instead.

One example is when you have dependencies between entities... ie: Document -> DocumentItems (when you delete Document, DocumentItems don't have a reason to exist)

Use cascade delete where you would want the record with the FK to be removed if the PK record it refers to was removed. In other words, where the record is meaningless without the referenced record.

I find cascade delete useful to ensure that dead references are removed by default rather than cause null exceptions.

ON DELETE CASCADE:

Use it when you want rows in the child table to be deleted if the corresponding row in the parent table is deleted.

If ON DELETE CASCADE isn't used, deleting a parent row that still has child rows raises a referential integrity error.

ON UPDATE CASCADE:

Use it when you want a change to the primary key to be propagated to the foreign keys that reference it.

One reason to put in a cascade delete (rather than doing it in the code) is to improve performance.

Case 1: With a cascade delete

 DELETE FROM ParentTable WHERE SomeDate < DATEADD(year, -7, GETDATE());

Case 2: Without a cascade delete (pseudocode)

 FOR EACH R IN (SELECT tableId FROM ParentTable WHERE SomeDate < DATEADD(year, -7, GETDATE())) LOOP
   DELETE FROM ChildTable WHERE tableId = R.tableId;
   /* More child tables here, before the parent row is deleted */
   DELETE FROM ParentTable WHERE tableId = R.tableId;
 NEXT

Secondly, when you add in an extra child table with a cascade delete, the code in Case 1 keeps working.
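The second point can be sketched like this, with SQLite (via Python's sqlite3) standing in for SQL Server. ParentTable, ChildTable and OtherChildTable are hypothetical names, and the date arithmetic uses SQLite's date() function rather than T-SQL. The purge statement names only the parent table, so it stays the same however many cascading child tables exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE ParentTable (
        tableId  INTEGER PRIMARY KEY,
        SomeDate TEXT NOT NULL              -- ISO-8601 date string
    );
    CREATE TABLE ChildTable (
        childId INTEGER PRIMARY KEY,
        tableId INTEGER NOT NULL
                REFERENCES ParentTable (tableId) ON DELETE CASCADE
    );
    -- A child table added later; the purge below never mentions it.
    CREATE TABLE OtherChildTable (
        otherId INTEGER PRIMARY KEY,
        tableId INTEGER NOT NULL
                REFERENCES ParentTable (tableId) ON DELETE CASCADE
    );
    INSERT INTO ParentTable (tableId, SomeDate) VALUES
        (1, '2000-01-01'),                  -- older than seven years
        (2, date('now'));                   -- current
    INSERT INTO ChildTable (tableId) VALUES (1), (1), (2);
    INSERT INTO OtherChildTable (tableId) VALUES (1), (2);
""")

# Case 1: one set-based statement against the parent only.
PURGE = "DELETE FROM ParentTable WHERE SomeDate < date('now', '-7 years')"
conn.execute(PURGE)

child_parents = [r[0] for r in conn.execute("SELECT tableId FROM ChildTable")]
other_parents = [r[0] for r in conn.execute("SELECT tableId FROM OtherChildTable")]
print(child_parents, other_parents)
```

Both child tables should be left with rows for parent 2 only, even though the purge was written without any knowledge of them.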

I would only put in a cascade where the semantics of the relationship is "part of". Otherwise some idiot will delete half of your database when you do:

DELETE FROM CURRENCY WHERE CurrencyCode = 'USD'

I have heard of DBAs and/or "Company Policy" that prohibit using "On Delete Cascade" (and others) purely because of bad experiences in the past. In one case a guy wrote three triggers which ended up calling one another. Three days to recover resulted in a total ban on triggers, all because of the actions of one idjit.

Of course sometimes triggers are needed instead of "On Delete cascade", like when some child data needs to be preserved. But in other cases, it's perfectly valid to use the On Delete cascade method. A key advantage of "On Delete cascade" is that it captures ALL the children; a custom-written trigger or stored procedure may not if it is not coded correctly.

I believe the developer should be allowed to make the decision based upon what the development is and what the spec says. A blanket ban based on a bad experience should not be the criterion; the "Never use" thought process is draconian at best. A judgement call needs to be made each and every time, and changes made as the business model changes.

Isn't this what development is all about?

I try to avoid deletes or updates that I didn't explicitly request in SQL Server, whether they come through cascading or through the use of triggers.

They tend to bite you in the ass some time down the line, either when trying to track down a bug or when diagnosing performance problems.

Where I would use them is in guaranteeing consistency for not very much effort. To get the same effect you would have to use stored procedures.

I, like everyone else here, find that cascade deletes are really only marginally helpful (it's really not that much work to delete referenced data in other tables -- if there are a lot of tables, you simply automate this with a script) but really annoying when someone accidentally cascade deletes some important data that is difficult to restore.

The only case where I'd use them is if the data in the table is highly controlled (e.g., limited permissions) and only updated or deleted through a controlled process (like a software update) that has been verified.

A deletion or update to S that removes a foreign-key value found in some tuples of R can be handled in one of three ways:

Rejection
Propagation
Nullification

Propagation is referred to as cascading.

There are two cases:

‣ If a tuple in S was deleted, delete the R tuples that referred to it.

‣ If a tuple in S was updated, update the value in the R tuples that refer to it.
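The three ways of handling a delete in S can be compared side by side. The sketch below uses SQLite through Python's sqlite3 module as a stand-in; the tables S and R follow the naming above, while the column names and the helper function delete_referenced_row are made up for illustration. NO ACTION plays the role of rejection, CASCADE of propagation and SET NULL of nullification.

```python
import sqlite3


def delete_referenced_row(on_delete: str):
    """Delete the S row that an R row refers to, under the given ON DELETE rule.

    Returns "rejected" if the delete is refused, otherwise the rows left in R.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("PRAGMA foreign_keys = ON")
        # on_delete comes from the fixed list below, not from user input.
        conn.executescript(f"""
            CREATE TABLE S (id INTEGER PRIMARY KEY);
            CREATE TABLE R (
                id   INTEGER PRIMARY KEY,
                s_id INTEGER REFERENCES S (id) ON DELETE {on_delete}
            );
            INSERT INTO S (id) VALUES (1);
            INSERT INTO R (id, s_id) VALUES (10, 1);
        """)
        try:
            conn.execute("DELETE FROM S WHERE id = 1")
        except sqlite3.IntegrityError:
            return "rejected"
        return conn.execute("SELECT id, s_id FROM R").fetchall()
    finally:
        conn.close()


outcomes = {
    rule: delete_referenced_row(rule)
    for rule in ("NO ACTION", "CASCADE", "SET NULL")
}
print(outcomes)
```

Rejection leaves both tables untouched, propagation empties R, and nullification keeps the R row with its s_id set to NULL.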

If you're working on a system with many different modules in different versions, cascading can be very helpful when the cascade-deleted items are part of, or owned by, the PK holder. Otherwise, all modules would require immediate patches to clean up their dependent items before deleting the PK owner, or the foreign key relation would be omitted completely, possibly leaving tons of garbage in the system if cleanup is not performed correctly.

I just introduced cascade delete for a new intersection table between two already existing tables (only the intersection rows get deleted), after cascade delete had been discouraged for quite some time. In this case it's also not too bad if data gets lost.

It is, however, a bad thing on enum-like list tables: somebody deletes entry 13 - yellow from table "colors", and all yellow items in the database get deleted. Also, these tables sometimes get updated in a delete-all-insert-all manner, which leads to referential integrity being omitted entirely. Of course that's wrong, but how do you change complex software that has been running for many years, when introducing true referential integrity risks unexpected side effects?

Another problem arises when the original foreign key values must be kept even after the primary key has been deleted. One can create a tombstone column and use the ON DELETE SET NULL option on the original FK, but this again requires triggers or specific code to maintain the redundant (except after PK deletion) key value.
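One possible shape of the tombstone idea, sketched in SQLite via Python's sqlite3 rather than in SQL Server. Everything named here (Customer, Invoice, OriginalCustomerId, the trigger name) is a hypothetical example. The trigger is a SQLite BEFORE DELETE trigger; SQL Server has no BEFORE triggers, so an equivalent there would have to be written differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Customer (
        CustomerId INTEGER PRIMARY KEY
    );
    CREATE TABLE Invoice (
        InvoiceId          INTEGER PRIMARY KEY,
        CustomerId         INTEGER
                           REFERENCES Customer (CustomerId) ON DELETE SET NULL,
        OriginalCustomerId INTEGER      -- tombstone: plain column, no FK
    );
    -- Copy the key into the tombstone column just before the parent goes away.
    CREATE TRIGGER Customer_keep_tombstone
    BEFORE DELETE ON Customer
    BEGIN
        UPDATE Invoice
           SET OriginalCustomerId = OLD.CustomerId
         WHERE CustomerId = OLD.CustomerId;
    END;
    INSERT INTO Customer (CustomerId) VALUES (7);
    INSERT INTO Invoice (InvoiceId, CustomerId) VALUES (100, 7);
""")

conn.execute("DELETE FROM Customer WHERE CustomerId = 7")

invoice = conn.execute(
    "SELECT CustomerId, OriginalCustomerId FROM Invoice WHERE InvoiceId = 100"
).fetchone()
print(invoice)
```

The invoice survives with a NULL foreign key, while the tombstone column still records which customer it once belonged to.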

Cascade deletes are extremely useful when implementing logical super-type and sub-type entities in a physical database.

When separate super-type and sub-type tables are used to physically implement super-types/sub-types (as opposed to rolling up all sub-type attributes into a single physical super-type table), there is a one-to-one relationship between these tables, and the issue then becomes how to keep the primary keys 100% in sync between them.

Cascade deletes can be a very useful tool to:

1) Make sure that deleting a super-type record also deletes the corresponding single sub-type record.

2) Make sure that any delete of a sub-type record also deletes the super-type record. This is achieved by implementing an "instead-of" delete trigger on the sub-type table that goes and deletes the corresponding super-type record, which, in turn, cascade deletes the sub-type record.

Using cascade deletes in this manner ensures that no orphan super-type or sub-type records ever exist, regardless of whether you delete the super-type record first or the sub-type record first.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow