Separate Join clause in a Composite Index

https://stackoverflow.com/questions/13165352

23-07-2021
|

質問

Would having a Composite Index be beneficial for something like this:

SELECT * FROM a INNER JOIN b ON(a.id=b.id)
                INNER JOIN c ON(a.bar=c.id)
                INNER JOIN d ON(a.foo=d.id)

Index would be:

(a.id, a.bar, a.foo)

解決

Only the leading edge of the index would be used (a.id), so only the INNER JOIN to b would benefit from the index... so the additional columns in the index (a.bar and a.foo) are not beneficial in the sample query posted.

From the MySql documentation:

MySQL cannot use the index to perform lookups if the columns do not form a leftmost prefix of the index. Suppose that you have the SELECT statements shown here:
SELECT * 
FROM tbl_name 
WHERE col1=val1; 

SELECT * 
FROM tbl_name 
WHERE col1=val1 AND col2=val2;

SELECT * 
FROM tbl_name 
WHERE col2=val2; 

SELECT * 
FROM tbl_name 
WHERE col2=val2 AND col3=val3; 
If an index exists on (col1, col2, col3), only the first two queries use the index. The third and fourth queries do involve indexed columns, but (col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).

他のヒント

When JOINing (that is, INNER JOIN, not LEFT JOIN), the Optimizer will pick any of the tables as the "first" table. Then it will use the ON clause to move on to the next table and do a "NLJ" (Nested Loop Join) to reach into that second table. Etc.

Usually, when there is a WHERE clause (or certain other things), that will determine which table is "first". Without that, the Optimizer will usually pick the smallest table.

Given a WHERE clause on one of the tables, it will look for the 'best' INDEX to use. Such an index is likely to have one or more columns involved in the WHERE clause. Two indexes will not (with rare exceptions) be used at the same time for a single table.

When moving on to the 'next' table, probably an index on the column(s) in the ON clause determines the optimal index. Note that if you have ON a.x=b.x AND a.y=b.y, then the composite INDEX(x,y) (either order) is desirable.

Back to the case of not deciding on which table to "start" with... The Optimizer will try a variety of orders for the table. With each different ordering, there is a different set of indexes.

For this

     FROM a 
     INNER JOIN b ON(a.id=b.id)
     INNER JOIN c ON(a.bar=c.id)
     INNER JOIN d ON(a.foo=d.id)

These are optimal, but we can't necessarily predict which will be used:

a:  (id)  -- if coming from b
a:  (bar) -- if coming from c
a:  (foo) -- if coming from d
b:  (id)  -- if coming from b 
c:  (id)  -- if coming from a
d:  (id)  -- if coming from a

Note: You probably already have PRIMARY KEY(id) in each of the tables, so there is no need to add INDEX(id).

You said SELECT *, meaning you wanted all the columns from all four tables. If that is not really the case, we can discuss another optimization: "covering indexes".

More discussion: http://mysql.rjweb.org/doc.php/index_cookbook_mysql

ライセンス： CC-BY-SA と帰属

所属していません StackOverflow