Question

There is a table that contains more id columns than actual data.

user_id int unsigned NOT NULL,
project_id int unsigned NOT NULL,
folder_id int unsigned NOT NULL,
file_id int unsigned NOT NULL,
data TEXT NOT NULL

The only way to create a unique primary key for this table would be a composite of (user_id, project_id, folder_id, file_id). I have frequently seen two-column composite primary keys, but is it ok to have 4 or even more? According to MySQL: "All storage engines support at least 16 indexes per table and a total index length of at least 256 bytes. Most storage engines have higher limits.", so I know it is at least possible.

Beyond this, there are frequent queries to this table for various combinations of these ids. For example, find all projects for user X, find all files for user X, find all files for project Y and folder Z, etc. Should there be a separate individual index on each of the id columns, or does a composite primary key that already contains all the columns make further individual keys redundant? There will be about 10 million to 50 million rows in the table at any time.

To summarize: is it ok to have a composite primary key with 4 (or more) id columns, and if there is a composite key does it make additional individual keys for each of those columns redundant?


Solution

Yes, it is ok to have a composite primary key with 4 or more columns.

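It doesn't necessarily make additional keys for each of those columns redundant. A composite index can only be used to look up rows through a leftmost prefix of its columns: a key (a, b, c) helps queries that filter on a, on (a, b), or on (a, b, c), but it will not be useful for a query SELECT ... WHERE b = 4. For that type of query you would rather have key (b) or key (b, c).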
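
As a rough sketch of how this applies to the table in the question: the table name file_data and the index name idx_project_folder are made up for illustration, and which secondary indexes you actually need depends on your real queries. The EXPLAIN output will also vary with MySQL version and data volume.

```sql
-- Hypothetical table name; the columns are taken from the question.
CREATE TABLE file_data (
    user_id    INT UNSIGNED NOT NULL,
    project_id INT UNSIGNED NOT NULL,
    folder_id  INT UNSIGNED NOT NULL,
    file_id    INT UNSIGNED NOT NULL,
    data       TEXT NOT NULL,
    PRIMARY KEY (user_id, project_id, folder_id, file_id),
    -- Assumed secondary index for lookups by project, or by project + folder.
    KEY idx_project_folder (project_id, folder_id)
) ENGINE=InnoDB;

-- user_id is the leftmost column of the primary key, so the key can be used.
EXPLAIN SELECT DISTINCT project_id FROM file_data WHERE user_id = 42;

-- (project_id, folder_id) is not a leftmost prefix of the primary key,
-- but it matches idx_project_folder.
EXPLAIN SELECT file_id FROM file_data WHERE project_id = 7 AND folder_id = 3;

-- No index starts with file_id, so this will most likely scan
-- unless a key beginning with file_id is added.
EXPLAIN SELECT * FROM file_data WHERE file_id = 1001;
```

In each EXPLAIN result, the key column shows which index (if any) the optimizer chose, which is a quick way to check whether a given query is covered.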

You need to examine your expected queries to determine which indexes you'll need. See this talk for more details: http://youtu.be/AVNjqgf7zNw

OTHER TIPS

Yes, this is OK if the data model supports it. You haven't shared much about your overall DB schema and how these items relate to each other, so it is hard to say whether this is the best approach. In other words, is this truly the only way in which these four items are related to each other? Or are the files really related to projects, and projects related to users, or something like that, such that splitting this up into separate join tables makes more logical sense?

If you are querying individual columns within this primary key, this might suggest that your schema is not quite correct. At a minimum, you might need to add an individual index on each of these columns to support such queries.

You're going to regret creating a compound primary key. It becomes really obnoxious to address individual rows, and secondary indexes in MySQL (InnoDB) must contain the primary key columns as a row identifier, so a wide primary key makes every other index wider too. You can create a UNIQUE key that's compound, though.
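
A minimal sketch of that alternative, assuming a surrogate auto-increment column is acceptable in your model. The table name file_data, the id column, and the key name uq_file_path are all hypothetical and not part of the original question.

```sql
-- Alternative definition: short surrogate primary key plus a compound UNIQUE key.
CREATE TABLE file_data (
    -- 4-byte surrogate id; INT UNSIGNED allows roughly 4.29 billion values.
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id    INT UNSIGNED NOT NULL,
    project_id INT UNSIGNED NOT NULL,
    folder_id  INT UNSIGNED NOT NULL,
    file_id    INT UNSIGNED NOT NULL,
    data       TEXT NOT NULL,
    PRIMARY KEY (id),
    -- Still enforces uniqueness of the four-column combination.
    UNIQUE KEY uq_file_path (user_id, project_id, folder_id, file_id)
) ENGINE=InnoDB;

-- A single row can now be addressed with one value instead of four.
UPDATE file_data SET data = 'new contents' WHERE id = 12345;
```

The trade-off is one extra column and one extra index. In exchange, any further secondary indexes carry a 4-byte row identifier instead of the 16-byte four-column key.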

You can have a composite key with a fairly large number of components, though keep in mind that the more columns you add, the bigger the index will get and the slower it will be to update on every INSERT. As your database grows in size, insert operations may get cripplingly slow.

This is why, whenever possible, you should try to minimize your index size.
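
One way to keep an eye on this is to compare data and index sizes as the table grows. The query below is only a sketch: file_data is a hypothetical table name, and for InnoDB the reported figures are estimates. With InnoDB, DATA_LENGTH roughly covers the clustered (primary key) index and INDEX_LENGTH covers the secondary indexes.

```sql
-- Approximate row count and on-disk sizes for one table in the current schema.
SELECT TABLE_NAME,
       TABLE_ROWS,
       ROUND(DATA_LENGTH  / 1024 / 1024, 1) AS data_mb,
       ROUND(INDEX_LENGTH / 1024 / 1024, 1) AS index_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'file_data';  -- hypothetical table name
```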

Licensed under: CC-BY-SA with attribution