Question

I'm doing some PHP/MySQL work and started to wonder which method would be more efficient and which would give better data integrity.

I've never used method #2, but I have seen it used in systems such as CMS or eCommerce platforms. I use #1 regularly.

Example: when creating a "task" in my system, I need to assign one or more users to it.


Method #1: I would have a separate table that stores task_id and user_id pairs, and I would query this table to get the relationship.


Method #2: I would have a "users_assigned" column in the task table. It would store a serialized array, which I would unserialize when needed, e.g.:

$data = array('John', 'Jack', 'Jill');
// after serialization it would look like...
// a:3:{i:0;s:4:"John";i:1;s:4:"Jack";i:2;s:4:"Jill";}

Which method is best for storing this type of data in a database?


Solution

Unless you have a strong reason to denormalize (storing the data duplicated in a serialized array breaks the rules of the normal forms), I'd stick to separate tables. Fetching is a bit more complicated because it needs joins, but you can enforce relational integrity in the database itself (foreign keys), and your data is better organized and more clearly defined.
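As a rough illustration of the separate-table approach, here is a minimal sketch. It assumes PDO with the SQLite driver and an in-memory database so that it runs without a server; the table names (users, tasks, task_users), the sample data, and the helper functions assignedUsers() and tryAssign() are all made up for the example. The same idea should carry over to MySQL, where foreign keys are enforced on InnoDB tables.

```php
<?php
// Sketch only: an in-memory SQLite database stands in for MySQL.
// All table, column, and function names here are hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('PRAGMA foreign_keys = ON'); // SQLite only enforces foreign keys when asked

$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE tasks (id INTEGER PRIMARY KEY, title TEXT NOT NULL)');
// Junction table: one row per (task, user) assignment.
$pdo->exec('CREATE TABLE task_users (
    task_id INTEGER NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    PRIMARY KEY (task_id, user_id)
)');

$pdo->exec("INSERT INTO users (id, name) VALUES (1, 'John'), (2, 'Jack'), (3, 'Jill')");
$pdo->exec("INSERT INTO tasks (id, title) VALUES (1, 'Write report')");
$pdo->exec('INSERT INTO task_users (task_id, user_id) VALUES (1, 1), (1, 3)');

// Fetch the names of everyone assigned to a task with a single join.
function assignedUsers(PDO $pdo, int $taskId): array
{
    $stmt = $pdo->prepare(
        'SELECT u.name
           FROM task_users tu
           JOIN users u ON u.id = tu.user_id
          WHERE tu.task_id = ?
          ORDER BY u.id'
    );
    $stmt->execute([$taskId]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Try to assign a user; returns false if the database rejects the row,
// for example because the user id does not exist (foreign key violation).
function tryAssign(PDO $pdo, int $taskId, int $userId): bool
{
    try {
        $pdo->prepare('INSERT INTO task_users (task_id, user_id) VALUES (?, ?)')
            ->execute([$taskId, $userId]);
        return true;
    } catch (PDOException $e) {
        return false;
    }
}
```

With this data, assignedUsers($pdo, 1) returns John and Jill, and tryAssign($pdo, 1, 99) returns false because the database refuses a user id that is not in the users table. That check is the integrity guarantee a serialized column cannot give you.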

Take a look at this article (http://backchannel.org/blog/friendfeed-schemaless-mysql) for an extreme case of denormalization. In their case, however, they don't fetch any additional linked data along with the rowsets, whereas in your case you would probably want to join in more information about the users stored in the serialized array.

OTHER TIPS

Method 2 is a bad idea. It pretty much prevents you from joining tables (it is still possible, but only with the kind of code you write to prove it can be done rather than to actually use).
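To make the problem concrete, here is a small sketch of what a reverse lookup ("which tasks is this user assigned to?") tends to look like with Method 2. The $rows array stands in for rows fetched from a hypothetical tasks table, and taskIdsForUser() is a made-up helper; the point is only that the database sees an opaque string, so the filtering ends up in PHP.

```php
<?php
// Hypothetical rows as they might come back from a tasks table whose
// users_assigned column holds a serialized PHP array.
$rows = [
    ['id' => 1, 'users_assigned' => serialize(['John', 'Jack', 'Jill'])],
    ['id' => 2, 'users_assigned' => serialize(['Jack'])],
    ['id' => 3, 'users_assigned' => serialize(['Jill', 'John'])],
];

// Every row has to be fetched and unserialized before it can be checked;
// an ordinary index or join on the user does not apply to this column.
function taskIdsForUser(array $rows, string $user): array
{
    $ids = [];
    foreach ($rows as $row) {
        // allowed_classes => false avoids instantiating objects from stored data.
        $users = unserialize($row['users_assigned'], ['allowed_classes' => false]);
        if (is_array($users) && in_array($user, $users, true)) {
            $ids[] = $row['id'];
        }
    }
    return $ids;
}
```

With a junction table, the same question is a single indexed WHERE clause on user_id instead of a scan over every task.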

Licensed under: CC-BY-SA with attribution