I am building a star schema to serve as the backend for an analytics app. My query generator builds queries using a regular star-join pattern. A sample query is below, in which a fact table is joined to two dimension tables and the dimension tables are filtered by constant values chosen by the end user.
I am using MySQL 5.5 and all tables are MyISAM.
In this problem, I am simply trying to pull the first N rows (in this case, the first row):
EXPLAIN
SELECT fact_table.*
FROM
fact_table
INNER JOIN
dim1 ON (fact_table.dim1_key = dim1.pkey)
INNER JOIN
dim2 ON (fact_table.dim2_key = dim2.pkey)
WHERE
dim1.constant_value = 123
AND dim2.constant_value = 456
ORDER BY
measure1 ASC LIMIT 1
The EXPLAIN output follows. Both dimension lookups resolve to constants, since each dimension table has a unique key on the filtered value.
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: dim1
type: const
possible_keys: PRIMARY,dim1_uk
key: dim1_uk
key_len: 8
ref: const
rows: 1
Extra: Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: dim2
type: const
possible_keys: PRIMARY,dim2_uk
key: dim2_uk
key_len: 8
ref: const
rows: 1
Extra:
*************************** 3. row ***************************
id: 1
select_type: SIMPLE
table: fact_table
type: ref
possible_keys: my_idx
key: my_idx
key_len: 16
ref: const,const
rows: 50010
Extra: Using where
And here is the index on the fact table:
show indexes from fact_table
*************************** 10. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 1
Column_name: dim1_key
Collation: A
Cardinality: 24
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
*************************** 11. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 2
Column_name: dim2_key
Collation: A
Cardinality: 70
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
*************************** 12. row ***************************
Table: fact_table
Non_unique: 1
Key_name: my_idx
Seq_in_index: 3
Column_name: measure1
Collation: A
Cardinality: 5643
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
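For readers who prefer DDL to the SHOW INDEXES listing, the three rows above should correspond to a definition roughly like the following. This is a reconstruction from the listing (column order taken from Seq_in_index), not the exact statement I originally ran.

```sql
-- Reconstructed from the SHOW INDEXES output above:
-- a non-unique BTREE index whose columns are, in order,
-- dim1_key (Seq_in_index 1), dim2_key (2), measure1 (3).
ALTER TABLE fact_table
    ADD INDEX my_idx (dim1_key, dim2_key, measure1);
```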
When profiling this query, I see that it spends the majority of its time performing a filesort ("sorting result"). My question is: even though it uses the correct index, why can't this query simply read the first value without sorting? my_idx is already sorted on the right column (measure1), and the two columns that come first in the index resolve to constants, as shown in the plan.
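To make that expectation concrete: once the two dimension rows have been read as constants, I would expect the join to be treated much like the single-table query sketched here. The literals 1 and 2 are hypothetical placeholders for whatever pkey values the dimension lookups return; they are not values from my data.

```sql
-- Sketch of what I expect the join to reduce to.
-- 1 and 2 are hypothetical stand-ins for dim1.pkey and dim2.pkey.
SELECT fact_table.*
FROM fact_table
WHERE dim1_key = 1      -- leading index column fixed to a constant
  AND dim2_key = 2      -- second index column fixed to a constant
ORDER BY measure1 ASC   -- third index column, already in index order
LIMIT 1;
```

With both leading columns of my_idx fixed by equality, the matching index entries are already ordered by measure1, which is why I expected no separate sort.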
If I rewrite the query as follows, I get the plan I want, with no filesort:
SELECT fact_table.*
FROM
fact_table
WHERE
dim1_key = (select pkey from dim1 where constant_value = 123)
AND dim2_key = (select pkey from dim2 where constant_value = 456)
ORDER BY
measure1 ASC LIMIT 1
Changing the tool that generates these SQL statements would be expensive, so I would like to avoid the filesort even when the query is written in its original form.
To summarize: why does MySQL opt for a filesort even when the leading columns of the index are constants (via an INNER JOIN) and the index is sorted in the right order? Is there a way around this?