Question

I created a table partitioned on a numeric ID:

CREATE TABLE mytable (
...
`id` int(11) DEFAULT NULL
...
) ENGINE=InnoDB DEFAULT CHARSET=latin1 PARTITION BY HASH (`id`) PARTITIONS 100

I have no primary key, but a number of indices. At the moment I don't have any data in my table where id is less than 0 or greater than 30, though I expect this range to grow. Most of my queries filter on the id first to reduce the search space.

I figured a query to select distinct(id) from mytable would then just return the number of partitions that had data in them. I was surprised that an explain on this instead shows a full scan of the data:

explain partitions select distinct(id) from mytable;

|  1 | SIMPLE      | mytable | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31,p32,p33,p34,p35,p36,p37,p38,p39,p40,p41,p42,p43,p44,p45,p46,p47,p48,p49,p50,p51,p52,p53,p54,p55,p56,p57,p58,p59,p60,p61,p62,p63,p64,p65,p66,p67,p68,p69,p70,p71,p72,p73,p74,p75,p76,p77,p78,p79,p80,p81,p82,p83,p84,p85,p86,p87,p88,p89,p90,p91,p92,p93,p94,p95,p96,p97,p98,p99 | ALL  | NULL          | NULL | NULL    | NULL | 24667132 | Using temporary |

explain select distinct(id) from mytable;
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+
| id | select_type | table                | type | possible_keys | key  | key_len | ref  | rows     | Extra           |
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+
|  1 | SIMPLE      | mytable              | ALL  | NULL          | NULL | NULL    | NULL | 24667132 | Using temporary |
+----+-------------+----------------------+------+---------------+------+---------+------+----------+-----------------+

I then read this Stack Overflow answer, which explained how MySQL's partition hash() function works.

My question is, how can I get MySQL to map each id in the table into its own partition, such that selects with the id narrow the search to a single partition (and a select distinct() just has to count the number of partitions and not scan them)?

I'm using Server version: 5.5.35-0ubuntu0.12.04.2 (Ubuntu).

Solution

First off, you're conflating two different things. One is that a SELECT ... WHERE id = ? should only search one partition. You mention this, but you don't say whether it currently works or not (given your table definition, I don't see why it shouldn't).

The second thing, having a SELECT distinct(id) touch only the partitioning information, is very different. If I understand you correctly, you're assuming that each partition holds only one id value. That is not how HASH partitioning works, though. It works similarly to a traditional hash table, mapping a large key space to a small one, in your case 100 partitions. So each partition can hold many possible ids. Since MySQL does not keep track of which of the possible ids are actually present in a partition, all it can do is scan each partition, do the DISTINCT, and give back the result.

That said, it could do the DISTINCT operation on the individual partitions instead of the whole table, and it could do this in parallel. However, the explain output seems to imply that it will create one big temporary table to do the DISTINCT, likely because this optimization hasn't been implemented yet.
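
To make the mapping concrete, here is a minimal Python model of it. It is only a sketch, assuming MySQL's documented rule that PARTITION BY HASH stores a row in partition MOD(expr, N), and it only models non-negative ids. The helpers partition_for and partitions_to_scan are hypothetical names made up for illustration; they are not MySQL functions.

```python
NUM_PARTITIONS = 100  # matches PARTITIONS 100 in the table definition


def partition_for(id_value, num_partitions=NUM_PARTITIONS):
    """Model of PARTITION BY HASH(id): partition number is MOD(id, N).

    MySQL treats a NULL partitioning value as 0 for HASH partitioning.
    Only non-negative ids are modeled here (an assumption of this sketch).
    """
    if id_value is None:
        return 0
    return id_value % num_partitions


def partitions_to_scan(where_id=None, num_partitions=NUM_PARTITIONS):
    """Partitions a query has to read in this model.

    With an equality filter on id, only one partition is needed.
    Without a filter (where_id=None), every partition must be read.
    """
    if where_id is not None:
        return [partition_for(where_id, num_partitions)]
    return list(range(num_partitions))


# Ids 0..30 each land in their own partition today ...
current = {i: partition_for(i) for i in range(31)}

# ... but 7, 107 and 207 would all share partition 7 once ids grow past 99.
colliding = [i for i in (7, 107, 207) if partition_for(i) == 7]

print(partitions_to_scan(where_id=7))  # [7]
print(len(partitions_to_scan()))       # 100
```

With ids 0 to 30 and 100 partitions, each id currently happens to sit alone in its partition, but nothing in the partition metadata records that, which is why the DISTINCT query still has to read the rows.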

License: CC-BY-SA with attribution
Not affiliated with StackOverflow