Question

I've got a problem with a MySQL query where the wrong (inefficient) index is used.

The table:

mysql> describe ADDRESS_BOOK;
+---------------+--------------+------+-----+---------+----------------+
| Field         | Type         | Null | Key | Default | Extra          |
+---------------+--------------+------+-----+---------+----------------+
| ADD_BOOK_ID   | bigint(20)   | NO   | PRI | NULL    | auto_increment |
| COMPANY_ID    | bigint(20)   | NO   | MUL | NULL    |                |
| ADDRESS_NAME  | varchar(150) | NO   | MUL | NULL    |                |
| CLEAN_NAME    | varchar(150) | NO   | MUL | NULL    |                |
| ADDRESS_KEY_1 | varchar(150) | NO   | MUL | NULL    |                |
| ADDRESS_KEY_2 | varchar(150) | NO   | MUL | NULL    |                |
+---------------+--------------+------+-----+---------+----------------+

CLEAN_NAME is a 'cleaned' version of the normal ADDRESS_NAME where everything but [a-zA-Z] has been removed, ADDRESS_KEY1 and ADDRESS_KEY2 are the two longest words in ADDRESS_NAME, again everything but [a-zA-Z] removed.

These are my indexes (playing around with it trying to find the best):

mysql> SHOW INDEX FROM ADDRESS_BOOK;
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table        | Non_unique | Key_name          | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| ADDRESS_BOOK |          0 | PRIMARY           |            1 | ADD_BOOK_ID   | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_BOOK_2 |            1 | COMPANY_ID    | A         |          36 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | IDX_ADDRESS_NAME  |            1 | ADDRESS_NAME  | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            1 | CLEAN_NAME    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            2 | ADDRESS_KEY_1 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            3 | ADDRESS_KEY_2 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FX_ADDRESS_KEYS   |            4 | COMPANY_ID    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_2      |            1 | ADDRESS_KEY_2 | A         |       18923 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_CLEAN          |            1 | CLEAN_NAME    | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
| ADDRESS_BOOK |          1 | FK_ADDRESS_1      |            1 | ADDRESS_KEY_1 | A         |       37847 |     NULL | NULL   |      | BTREE      |         |               |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Now my query is:

select * from ADDRESS_BOOK addressboo0_ 
where (addressboo0_.CLEAN_NAME like concat('trad', '%') 
or addressboo0_.ADDRESS_KEY_1 like concat('trad', '%') 
or addressboo0_.ADDRESS_KEY_2 like concat('trad', '%')) 
and addressboo0_.COMPANY_ID=1 
order by addressboo0_.CLEAN_NAME asc 
limit 200

There are users from different companies in the system, so a query should only return address book entries for the company of the user.

The explain for that is

+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
| id | select_type | table        | type | possible_keys                                                        | key               | key_len | ref   | rows | Extra                       |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
|  1 | SIMPLE      | addressboo0_ | ref  | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_ADDRESS_BOOK_2 | 8       | const | 4108 | Using where; Using filesort |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+

I know that MySQL can't use multicolumn indexes on or queries but as you can see it is using the index for COMPANY (FK_ADDRESS_BOOK_2) and not any of the indexes for the string columns!

If I take the company out from the query it will use the other indexes:

+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
| id | select_type | table        | type        | possible_keys                                      | key                                | key_len     | ref  | rows | Extra                                                                             |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
|  1 | SIMPLE      | addressboo0_ | index_merge | FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2 | 452,452,452 | NULL | 1089 | Using sort_union(FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2); Using where; Using filesort |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+

If I use the same query (incl company) for a different company it suddenly uses the multi-column index:

+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table        | type  | possible_keys                                                        | key             | key_len | ref  | rows | Extra       |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
|  1 | SIMPLE      | addressboo0_ | index | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FX_ADDRESS_KEYS | 1364    | NULL |  492 | Using where |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+

So for company 1 it has 266 results while for company 16 it has 437. In total company 1 has 4109 entries while company 16 has 7745 entries.

So I am rather confused. Why is MySQL using the multi-column index FX_ADDRESS_KEYS for one company but the rather inefficient FK_ADDRESS_BOOK_2 for the other company (basically going through every single row for that company).

How can I improve the query/index? If I remove the or for ADDRESS_KEY_1 and ADDRESS_KEY_2 it is using the FX_ADDRESS_KEYS index but I lose the ability to search for Strings inside the name. If I use something like '%trade%' no index can be used.

Était-ce utile?

La solution

If you want to have a pretty looking explain plan for this query, then try this:

CREATE INDEX FX_ADDRESS_KEYS_XX  ON ADDRESS_BOOK( 
         COMPANY_ID, 
         CLEAN_NAME, 
         ADDRESS_KEY_1, 
         ADDRESS_KEY_2 );

This index should improve the query, but at some costs.
It contains a copy of almost the whole table (except 2 columns: ADD_BOOK_ID bigint(20) and ADDRESS_NAME varchar(150)) - it will take quite a lot of disk space.
And it for sure slow down inserts and updates, since index data must also be updated.

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top