Question

I'm encountering a strange behavior of MySQL. Query execution (i.e. the usage of indexes as shown by explain [QUERY]) and time needed for execution are dependent on the elements of the where clause.

Here is a query where the problem occurs:

select distinct
e1.idx, el1.idx, r1.fk_cat, r2.fk_cat
from ent e1, ent_leng el1, rel_c r1, _tax_c t1, rel_c r2, _tax_c t2
where el1.fk_ent=e1.idx
and r1.fk_ent=e1.idx and ((r1.fk_cat=43) or (r1.fk_cat=t1.fk_cat1 and t1.fk_cat2=43))
and r2.fk_ent=e1.idx and ((r2.fk_cat=10) or (r2.fk_cat=t2.fk_cat1 and t2.fk_cat2=10))

The corresponding explain output is:

| id | select_type | table | type   | possible_keys           | key     | key_len | ref           | rows  | Extra                       
+----+-------------+-------+--------+-------------------------+---------+---------+---------------+-------+------------------------------------
|  1 | SIMPLE      | el1   | index  | fk_ent                  | fk_ent  | 4       | NULL          | 15002 | Using index; Using temporary
|  1 | SIMPLE      | e1    | eq_ref | PRIMARY                 | PRIMARY | 4       | DB.el1.fk_ent |     1 | Using index
|  1 | SIMPLE      | r1    | ref    | fk_ent,fk_cat,fks       | fks     | 4       | DB.e1.idx     |     1 | Using where; Using index
|  1 | SIMPLE      | r2    | ref    | fk_ent,fk_cat,fks       | fks     | 4       | DB.el1.fk_ent |     1 | Using index
|  1 | SIMPLE      | t1    | index  | fk_cat1,fk_cat2,fk_cats | fk_cats | 8       | NULL          |    69 | Using where; Using index; Distinct; 
|    |             |       |        |                         |         |         |               |       | Using join buffer
|  1 | SIMPLE      | t2    | index  | fk_cat1,fk_cat2,fk_cats | fk_cats | 8       | NULL          |    69 | Using where; Using index; Distinct;  
                                                                                                          | Using join buffer

As you can see a one-column index has the same name as the column it belongs to. I also added some useless indexes along with the used ones, just to see if they change the execution (which they don't).

The execution takes ~4.5 seconds.

When I add the column entl1.name to the select part (nothing else changed), the index fk_ent in el1 cannot be used any more:

| id | select_type | table | type   | possible_keys           | key     | key_len | ref           | rows  | Extra                       
+----+-------------+-------+--------+-------------------------+---------+---------+---------------+-------+------------------------------------
|  1 | SIMPLE      | el1   | ALL    | fk_ent                  |  NULL   | NULL    | NULL          | 15002 | Using temporary

The execution now takes ~8.5 seconds.

I always thought that the select part of a query does not influence the usage of indexes by the engine and doesn't affect performance in such a way.

Leaving out the attribute isn't a solution, and there are even more attributes that i have to select. Even worse, the query in the used form is even a bit more complex and that makes the performance issue a big problem.

So my questions are: 1) What is the reason for this strange behavior? 2) How can I solve the performance problem?

Thanks for your help! Gred

Was it helpful?

Solution

It's the DISTINCT restriction. You can think of that as another WHERE restriction. When you change the select list, you are really changing the WHERE clause for the DISTINCT restriction, and now the optimizer decides that it has to do a table scan anyway, so it might as well not use your index.

EDIT:

Not sure if this helps, but if I am understanding your data correctly, I think you can get rid of the DISTINCT restriction like this:

select
e1.idx, el1.idx, r1.fk_cat, r2.fk_cat
from ent e1
  Inner Join ent_leng el1 ON el1.fk_ent=e1.idx
  Inner Join rel_c r1 ON r1.fk_ent=e1.idx
  Inner Join rel_c r2 ON r2.fk_ent=e1.idx
where 
 ((r1.fk_cat=43) or Exists(Select 1 From _tax_c t1 Where r1.fk_cat=t1.fk_cat1 and t1.fk_cat2=43)) 
 and 
 ((r2.fk_cat=10) or Exists(Select 1 From _tax_c t2 Where r2.fk_cat=t2.fk_cat1 and t2.fk_cat2=10))

OTHER TIPS

MySQL will return data from an index if possible, saving the entire row from being loaded. In this way, the selected columns can influence the index selection.

With this in mind, it can much more efficient to add all required columns to an index, especially in the case of only selecting a small subset of columns.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top