Question

As far as I know, foreign keys (FK) are used to aid the programmer to manipulate data in the correct way. Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?

Are there any other uses for foreign keys? Am I missing something here?

Was it helpful?

Solution

Foreign keys help enforce referential integrity at the data level. They also improve performance because they're normally indexed by default.

OTHER TIPS

Foreign keys can also help the programmer write less code using things like ON DELETE CASCADE. This means that if you have one table containing users and another containing orders or something, then deleting a user could automatically delete all orders that point to that user.

I can't imagine designing a database without foreign keys. Without them, eventually you are bound to make a mistake and corrupt the integrity of your data.

They are not required, strictly speaking, but the benefits are huge.

I'm fairly certain that FogBugz does not have foreign key constraints in the database. I would be interested to hear how the Fog Creek Software team structures their code to guarantee that they will never introduce an inconsistency.

A database schema without FK constraints is like driving without a seat belt.

One day, you'll regret it. Not spending that little extra time on the design fundamentals and data integrity is a sure fire way of assuring headaches later.

Would you accept code in your application that was that sloppy? That directly accessed the member objects and modified the data structures directly.

Why do you think this has been made hard and even unacceptable within modern languages?

Yes.

  1. They keep you honest
  2. They keep new developers honest
  3. You can do ON DELETE CASCADE
  4. They help you to generate nice diagrams that self explain the links between tables

Personally, I am in favor of foreign keys because it formalizes the relationship between the tables. I realize that your question presupposes that the programmer is not introducing data that would violate referential integrity, but I have seen way too many instances where data referential integrity is violated, despite best intentions!

Pre-foreign key constraints (aka declarative referential integrity or DRI) lots of time was spent implementing these relationships using triggers. The fact that we can formalize the relationship by a declarative constraint is very powerful.

@John - Other databases may automatically create indexes for foreign keys, but SQL Server does not. In SQL Server, foreign key relationships are only constraints. You must defined your index on foreign keys separately (which can be of benefit.)

Edit: I'd like to add that, IMO, the use of foreign keys in support of ON DELETE or ON UPDATE CASCADE is not necessarily a good thing. In practice, I have found that cascade on delete should be carefully considered based on the relationship of the data -- e.g. do you have a natural parent-child where this may be OK or is the related table a set of lookup values. Using cascaded updates implies you are allowing the primary key of one table to be modified. In that case, I have a general philosophical disagreement in that the primary key of a table should not change. Keys should be inherently constant.

Suppose a programmer is actually doing this in the right manner already

Making such a supposition seems to me to be an extremely bad idea; in general software is phenomenally buggy.

And that's the point, really. Developers can't get things right, so ensuring the database can't be filled with bad data is a Good Thing.

Although in an ideal world, natural joins would use relationships (i.e. FK constraints) rather than matching column names. This would make FKs even more useful.

Without a foreign key how do you tell that two records in different tables are related?

I think what you are referring to is referential integrity, where the child record is not allowed to be created without an existing parent record etc. These are often known as foreign key constraints - but are not to be confused with the existence of foreign keys in the first place.

Is there a benefit to not having foreign keys? Unless you are using a crappy database, FKs aren't that hard to set up. So why would you have a policy of avoiding them? It's one thing to have a naming convention that says a column references another, it's another to know the database is actually verifying that relationship for you.

I suppose you are talking about foreign key constraints enforced by the database. You probably already are using foreign keys, you just haven't told the database about it.

Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?

Theoretically, no. However, there have never been a piece of software without bugs.

Bugs in application code are typically not that dangerous - you identify the bug and fix it, and after that the application runs smoothly again. But if a bug allows currupt data to enter the database, then you are stuck with it! It's very hard to recover from corrupt data in the database.

Consider if a subtle bug in FogBugz allowed a corrupt foreign key to be written in the database. It might be easy to fix the bug and quickly push the fix to customers in a bugfix release. However, how should the corrupt data in dozens of databases be fixed? Correct code might now suddenly break because the assumptions about the integrity of foreign keys dont hold anymore.

In web applications you typically only have one program speaking to the database, so there is only one place where bugs can corrupt the data. In an enterprise application there might be several independent applications speaking to the same database (not to mention people working directly with the database shell). There is no way to be sure that all applications follow the same assumptions without bugs, always and forever.

If constraints are encoded in the database, then the worst that can happen with bugs is that the user is shown an ugly error message about some SQL constraint not satisfied. This is much prefereable to letting currupt data into your enterprise database, where it in turn will break all your applications or just lead to all kinds of wrong or misleading output.

Oh, and foreign key constraints also improves performance because they are indexed by default. I can't think of any reason not to use foreign key constraints.

FKs are very important and should always exist in your schema, unless you are eBay.

I think some single thing at some point must be responsible for ensuring valid relationships.

For example, Ruby on Rails does not use foreign keys, but it validates all the relationships itself. If you only ever access your database from that Ruby on Rails application, this is fine.

However, if you have other clients which are writing to the database, then without foreign keys they need to implement their own validation. You then have two copies of the validation code which are most likely different, which any programmer should be able to tell is a cardinal sin.

At that point, foreign keys really are neccessary, as they allow you to move the responsibility to a single point again.

Foreign keys allow someone who has not seen your database before to determine the relationship between tables.

Everything may be fine now, but think what will happen when your programmer leaves and someone else has to take over.

Foreign keys will allow them to understand the database structure without trawling through thousand of lines of code.

As far as I know, foreign keys are used to aid the programmer to manipulate data in the correct way.

FKs allow the DBA to protect data integrity from the fumbling of users when the programmer fails to do so, and sometimes to protect against the fumbling of programmers.

Suppose a programmer is actually doing this in the right manner already, then do we really need the concept of foreign keys?

Programmers are mortal and fallible. FKs are declarative which makes them harder to screw up.

Are there any other uses for foreign keys? Am I missing something here?

Although this is not why they were created, FKs provide strong reliable hinting to diagramming tools and to query builders. This is passed on to end users, who desperately need strong reliable hints.

They are not strictly necessary, in the way that seatbelts are not strictly necessary. But they can really save you from doing something stupid that messes up your database.

It's so much nicer to debug a FK constraint error than have to reconstruct a delete that broke your application.

They are important, because your application is not the only way data can be manipulated in the database. Your application may handle referential integrity as honestly as it wants, but all it takes is one bozo with the right privileges to come along and issue an insert, delete or update command at the database level, and all your application referential integrity enforcement is bypassed. Putting FK constraints in at the database level means that, barring this bozo choosing to disable the FK constraint before issuing their command, the FK constraint will cause a bad insert/update/delete statement to fail with a referential integrity violation.

I think about it in terms of cost/benefit... In MySQL, adding a constraint is a single additional line of DDL. It's just a handful of key words and a couple of seconds of thought. That's the only "cost" in my opinion...

Tools love foreign keys. Foreign keys prevent bad data (that is, orphaned rows) that may not affect business logic or functionality and therefor go unnoticed, and build up. It also prevents developers who are unfamiliar with the schema from implementing entire chunks of work without realizing they're missing a relationship. Perhaps everything is great within the scope of your current application, but if you missed something and someday something unexpected is added (think fancy reporting), you might be in a spot where you have to manually clean up bad data that's been accumulating since the inception of the schema without a database enforced check.

The little time it takes to codify what's already in your head when you're putting things together could save you or someone else a bunch of grief months or years down the road.

The question:

Are there any other uses for foreign keys? Am I missing something here?

It is a bit loaded. Insert comments, indentation or variable naming in place of "foreign keys"... If you already understand the thing in question perfectly, it's "no use" to you.

Entropy reduction. Reduce the potential for chaotic scenarios to occur in the database. We have a hard time as it is considering all the possiblilites so, in my opinion, entropy reduction is key to the maintenance of any system.

When we make an assumption for example: each order has a customer that assumption should be enforced by something. In databases that "something" is foreign keys.

I think this is worth the tradeoff in development speed. Sure, you can code quicker with them off and this is probably why some people don't use them. Personally I have killed a number of hours with NHibernate and some foreign key constraint that gets angry when I perform some operation. HOWEVER, I know what the problem is so it's less of a problem. I'm using normal tools and there are resources to help me work around this, possibly even people to help!

The alternative is allow a bug to creep into the system (and given enough time, it will) where a foreign key isn't set and your data becomes inconsistent. Then, you get an unusual bug report, investigate and "OH". The database is screwed. Now how long is that going to take to fix?

You can view foreign keys as a constraint that,

  • Help maintain data integrity
  • Show how data is related to each other (which can help in enforcing business logic and rules)
  • If used correctly, can help increase the efficiency with which the data is fetched from the tables.

We don't currently use foreign keys. And for the most part we don't regret it.

That said - we're likely to start using them a lot more in the near future for several reasons, both of them for similar reasons:

  1. Diagramming. It's so much easier to produce a diagram of a database if there are foreign key relationships correctly used.

  2. Tool support. It's a lot easier to build data models using Visual Studio 2008 that can be used for LINQ to SQL if there are proper foreign key relationships.

So I guess my point is that we've found that if we're doing a lot of manual SQL work (construct query, run query, blahblahblah) foreign keys aren't necessarily essential. Once you start getting into using tools, though, they become a lot more useful.

The best thing about foreign key constraints (and constraints in general, really) are that you can rely on them when writing your queries. A lot of queries can become a lot more complicated if you can't rely on the data model holding "true".

In code, we'll generally just get an exception thrown somewhere - but in SQL, we'll generally just get the "wrong" answers.

In theory, SQL Server could use constraints as part of a query plan - but except for check constraints for partitioning, I can't say that I've ever actually witnessed that.

Foreign keys had never been explicit (FOREIGN KEY REFERENCES table(column)) declared in projects (business applications and social networking websites) which I worked on.

But there always was a kind of convention of naming columns which were foreign keys.

It's like with database normalization -- you have to know what are you doing and what are consequence of that (mainly performance).

I am aware of advantages of foreign keys (data integrity, index for foreign key column, tools aware of database schema), but also I am afraid of using foreign keys as general rule.

Also various database engines could serve foreign keys in a different way, which could lead to subtle bugs during migration.

Removing all orders and invoices of deleted client with ON DELETE CASCADE is the perfect example of nice looking, but wrong designed, database schema.

Yes. The ON DELETE [RESTRICT|CASCADE] keeps developers from stranding data, keeping the data clean. I recently joined a team of Rails developers who did not focus on database constraints such as foreign keys.

Luckily, I found these: http://www.redhillonrails.org/foreign_key_associations.html -- RedHill on Ruby on Rails plug-ins generate foreign keys using the convention over configuration style. A migration with product_id will create a foreign key to the id in the products table.

Check out the other great plug-ins at RedHill, including migrations wrapped in transactions.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top