Question

I've created many databases before, but I have never linked two tables together. I've tried looking around, but cannot find WHY one would need to link two or more tables together.

There is a good tutorial here that goes over database relationships, but does not explain why they would be needed. He just simply says that they are.

Are they truly necessary? I understand that (in his example) all orders have a customer, and so one would link the orders table to the customers table, but I just don't see why this would be absolutely necessary. I can (and have) created shopping carts and other complex databases that work just fine without creating any table relationships.

I've just started playing around with MySQL Workbench v6.0 for a new project that has a fairly large and complex database, and so I'm wondering if I am losing anything by creating the entire project without relationships?

NOTE: Please let me know if this question is too general or off topic, and I will change it. I understand that a lot can be said about this topic, and so I'm really just looking to know if I am opening myself up to any security issues or significant performance issues by not using relationships. Please be specific in your response; "Yes you are opening yourself up to performance issues" is useless and not helpful for myself, nor for anyone else looking at this thread at a later date. Please include details and specifics in your response.

Thank you in advance!

Was it helpful?

Solution 2

The answer revolves around granularity, space consumption, speed, and detail.

Inherently different types of data will be more granular than others, as items can always be rolled up to a larger umbrella. For a chain of stores, items sold can be rolled up into transactions, transactions can be rolled up into register batches, register batches can be rolled up to store sales, store sales can be rolled up to company sales. The two options then are:

  1. Store the data at the lowest grain in a single table
  2. Store the data in separate tables that are dedicated to purpose

In the first case, there would be a lot of redundant data, as each item sold at location 3 of 430 would have store, date, batch, transaction, and item information. That redundant data takes up a large volume of space, when you could very easily create separated tables for their unique purpose.

In this example, lets say there were a thousand transactions a day totaling a million items sold from that one store. By creating separate tables you would have:

  • Stores = 430 records
  • Registers = 10 records
  • Transactions = 1000 records
  • Items sold = 1000000 records

I'm sure your asking where the space savings comes in ... it is in the detail for each record. The store table has names, address, phone, etc. The register has number, purchase date, manager who reconciles, etc. Transactions have customer, date, time, amount, tax, etc. If these values were duplicated for every record over a single table it would be a massive redundancy of data adding up to far more space consumption than would occur just by linking a field in one table (transaction id) to a field in another table (item id) to show that relationship.

Additionally, the amount of space consumed, as well as the size of the overall table, inversely impacts the speed of you querying that data. By keeping tables small and capitalizing on the relationship identifiers to link between them, you can greatly increase the response time. Every time the query engine needs to find a value, it traverses the table until it finds it (that is a grave oversimplification, but not untrue), so the larger and broader the table the longer the seek time. These problems do not exist with insignificant volumes of data, but for organizations that deal with millions, billions, trillions of records (I work for one of them) storing everything in a single table would make the application unusable.

There is so very, very much more on this topic, but hopefully this gives a bit more insight.

OTHER TIPS

As Sam D points out in the comments, entire books can be written about database design and why having tables with relationships can make a lot of sense.

That said, theoretically, you lose absolutely no expressive/computational power by just putting everything in the same table. The primary arguments against doing so likely deal with performance and maintenance issues that might arise.

Short answer: In a relational database like MySQL Yes. Check this out about referential integrity http://databases.about.com/cs/administration/g/refintegrity.htm

That does not mean that you have to use relational databases for your project. In fact the trend is to use Non-Relational databases (NoSQL), like MongoDB to achieve same results with better performance. More about RDBMS vs NoSQL http://www.zdnet.com/rdbms-vs-nosql-how-do-you-pick-7000020803/

I think that with this example you will understand better:

Let's we want to create on-line store. We have at minimum Users, Payments and Events (events about the pages where the user navigates or other actions). In this scenario we want to link in a secure and relational way the Users with the Payments. We do not want a Payment to be lost or assigned to another User. So we can use a RDBMS like MySQL to create the tables Users and Payments and linked the with proper Foreign Keys. However for the events, we are going to be a lot of them per users (maybe millions) and we need to track them in a fast way without killing the relation database. In that case a No-SQL database like MongoDB makes totally sense.

To sum up to can use an hybrid of SQL and NO-SQL, but either if you use one, the other or both kind of solutions, do it properly.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top