Domanda

I'm new to query optimizations so I accept I don't understand everything yet but I do not understand why even this simple query isn't optimized as expected.

My table:

+------------------+-----------+------+-----+-------------------+----------------+
| Field            | Type      | Null | Key | Default           | Extra          |
+------------------+-----------+------+-----+-------------------+----------------+
| tasktransitionid | int(11)   | NO   | PRI | NULL              | auto_increment |
| taskid           | int(11)   | NO   | MUL | NULL              |                |
| transitiondate   | timestamp | NO   | MUL | CURRENT_TIMESTAMP |                |
+------------------+-----------+------+-----+-------------------+----------------+

My indexes:

+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table           | Non_unique | Key_name          | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| tasktransitions |          0 | PRIMARY           |            1 | tasktransitionid | A         |         952 |     NULL | NULL   |      | BTREE      |         |               |
| tasktransitions |          1 | transitiondate_ix |            1 | transitiondate   | A         |         952 |     NULL | NULL   |      | BTREE      |         |               |
+-----------------+------------+-------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

My query:

SELECT taskid FROM tasktransitions WHERE transitiondate>'2013-09-31 00:00:00';

gives this:

+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+
| id | select_type | table           | type | possible_keys     | key  | key_len | ref  | rows | Extra       |
+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | tasktransitions | ALL  | transitiondate_ix | NULL | NULL    | NULL | 1082 | Using where |
+----+-------------+-----------------+------+-------------------+------+---------+------+------+-------------+

If I understand everything correctly Using where and ALL means that all rows are retrieved from the storage engine and filtered at server layer. This is sub-optimal. Why does it refuse to use the index and only retrieve the requested range from the storage engine (innoDB)?

Cheers

È stato utile?

Soluzione

MySQL will not use the index if it estimates that it would select a significantly large portion of the table, and it thinks that a table-scan is actually more efficient in those cases.

By analogy, this is the reason the index of a book doesn't contain very common words like "the" -- because it would be a waste of time to look up the word in the index and find the list of page numbers is a very long list, even every page in the book. It would be more efficient to simply read the book cover to cover.

My experience is that this happens in MySQL if a query's search criteria would match greater than 20% of the table, and this is usually the right crossover point. There could be some variation based on the data types, size of table, etc.

You can give a hint to MySQL to convince it that a table-scan would be prohibitively expensive, so it would be much more likely to use the index. This is not usually necessary, but you can do it like this:

SELECT taskid FROM tasktransitions FORCE INDEX (transitiondate_ix)
WHERE transitiondate>'2013-09-31 00:00:00';

Altri suggerimenti

Just in case, it helps somebody.

I have a table with a varchar column _id (long int coded as string). I added an index for this column, but query was still slow. I was executing this query:

select * from table where (_id = 2221835089) limit 1

I realized that the _id column wasn't been generated as string (I'm Laravel as DB framework). Well, if query is executed with the right data type in the where clause everything worked like a charm:

select * from table where (_id = '2221835089') limit 1

I am new at my MySQL 8.0, have finished 2 simple tutorials completely, and there is only two subjects that has not worked for me, one of them is indexing. I read the section labeled "2 Answers" and found that using the statement suggested at the end of said section, seems to defeat the purpose of the original USE INDEX or FORCE INDEX statement below. The suggested statement is like getting a table sorted via a WHERE statement instead of MySQL using USE INDEX or FORCE INDEX. It works, but seems to me it is not the same as using the natural USE INDEX or FORCE INDEX. Does any one knows why MySQL is ignoring my simple request to index a 10 row table on the Lname column?

Field Type Null Key Default Extra
ID int NO PRI Null auto_increment
Lname varchar(20) NO MUL Null
Fname varchar(20) NO Mul Null
City varchar(15) NO Null
Birth_Date date NO Null
CREATE INDEX idx_Lname ON TestTable (Lname);
SELECT * FROM TestTable USE INDEX (idx_Lname);
SELECT * From Testtable FORCE INDEX (idx_LastFirst); 

I once was trying to join two tables and MySQL was refusing to use an index, resulting in >500ms queries, sometimes a few seconds. Turns out the column I was joining on had different encodings on each table. Changing both to the same encoding sped up the query to consistently less than 100ms.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top