Question

I see tons of questions about unique constraints on multiple columns, but none that match what I am specifically looking for. If this is a duplicate of one, I apologize.

I have a table with just two columns: tableA_id and tableB_id.

My primary key is a unique constraint on both columns, and I have an index on each column. Each column is also the primary key of its respective table.

If TableA is likely to have, say, 10,000,000 rows and TableB 2,000,000 rows, TableB values will likely appear in this constraint far fewer times. That being said, when I create the unique constraint, is it more efficient to put TableB as the first column since there are fewer values to search, or TableA (and if so, why)? Or does it make no difference, because it does not search one column first and then the other, but rather goes one by one looking at both?

Thanks in advance


Solution

It is usually recommended to put a column with more distinct values on the left in a composite index. That results in a more selective index, which is better for finding a specific value.

A quote from the MySQL docs:

To eliminate rows from consideration. If there is a choice between multiple indexes, MySQL normally uses the index that finds the smallest number of rows (the most selective index)

But I have the impression that you are trying to optimize failed inserts into the table. If you have more writes to the table than reads and most of the writes are duplicates, then you are probably right. But even in that case, MySQL still needs to check the other column for uniqueness. Thus, it is still better to put the column with more distinct values first.
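To illustrate the point that both columns are checked for uniqueness whichever one comes first, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it shows only the constraint semantics, not MySQL's performance. The table name `link`, the helper `count_rejected`, and the sample pairs are made up for the illustration.

```python
import sqlite3

def count_rejected(column_order, pairs):
    """Insert pairs into a two-column table whose primary key uses the given
    column order, and return how many inserts were rejected as duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE link ("
        "tableA_id INTEGER NOT NULL, tableB_id INTEGER NOT NULL, "
        f"PRIMARY KEY ({column_order}))"
    )
    rejected = 0
    for a_id, b_id in pairs:
        try:
            conn.execute(
                "INSERT INTO link (tableA_id, tableB_id) VALUES (?, ?)",
                (a_id, b_id),
            )
        except sqlite3.IntegrityError:
            # The (tableA_id, tableB_id) pair already exists.
            rejected += 1
    conn.close()
    return rejected

# Hypothetical sample data: the third and fifth pairs repeat earlier ones.
pairs = [(1, 10), (2, 10), (1, 10), (1, 20), (2, 10)]
a_first = count_rejected("tableA_id, tableB_id", pairs)
b_first = count_rejected("tableB_id, tableA_id", pairs)
print(a_first, b_first)
```

Both orders reject exactly the same rows, so column order changes how the index is searched, not what counts as a duplicate.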

OTHER TIPS

From your description I assume you have the following:

UNIQUE (tableA_id, tableB_id)
INDEX (tableA_id)
INDEX (tableB_id)

In that case the single-column index on tableA_id is not necessary, because any statement that could use it can also use the primary key index, where tableA_id is the leading column. So you can at least drop the single-column index on tableA_id.

I don't think MySQL's optimizer is smart enough to use the PK index for a statement that contains WHERE tableB_id = 42.

So you probably want to keep the single-column index on tableB_id if you use that ID as the only criterion in your statements.

If you always query that table using both IDs, there is no need to keep the single column index.
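Why the leading column matters can be sketched with a toy model. This is not how MySQL stores its indexes. It is just a sorted Python list standing in for a composite index on (tableA_id, tableB_id), and the names `index`, `find_by_a` and `find_by_b` are hypothetical.

```python
from bisect import bisect_left

# Toy stand-in for a composite index on (tableA_id, tableB_id):
# entries are kept sorted by tableA_id first, then by tableB_id.
index = sorted([(1, 10), (1, 20), (2, 10), (3, 30), (3, 10)])

def find_by_a(a_id):
    """Binary-search to the first entry for a_id, then read the adjacent run."""
    start = bisect_left(index, (a_id,))
    matches = []
    for entry in index[start:]:
        if entry[0] != a_id:
            break  # sorted order guarantees there are no further matches
        matches.append(entry)
    return matches

def find_by_b(b_id):
    """Entries for one tableB_id are scattered, so every entry is examined."""
    return [entry for entry in index if entry[1] == b_id]

print(find_by_a(3))
print(find_by_b(10))
```

A lookup on tableA_id can jump straight to one contiguous run of entries, while a lookup on tableB_id alone has to walk the whole list. That is the gap a separate index on tableB_id fills.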

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow