質問

Our customers are all part of the same database, where we store all of their images, blog posts, etc. I want to partition the tables to optimize performance for each and every one of their sites (specified by "site_id" in every table).

As an example, let's use our images table. Available columns are: image_id (int, primary, auto_inc), site_id (int, primary), width (int), height (int), filesize (int), uploaded (datetime).

I've tried just adding partitioning; ALTER TABLE images PARTITION BY KEY(image_id,site_id), which worked. However, I have no way of knowing what MySQL uses to "group" the partitions - is it optimized for what I want? That's where you guys come in.

Question 1

Is PARTITION BY KEY the correct way to set up a partition to get the site_id "separated" from each other? If not - how would you do it?

Question 2

I've seen that people are manually adding the partitions - p0, p1, p2, p3, etc. Do you need to specify the number of partitions, or does MySQL "understand" that I'm after the site_id?

役に立ちましたか?

解決

Question 1

If you use partition by key(image_id,site_id), you cannot be sure that each partition will only contain separated site_id's because this kind of partitionning will use a integrated hashing function on (image_id,site_id) and the result of this will determinate in wich partition the row will be inserted into.

If you want to ensure separation you should use RANGE or LIST partitionning.

Question 2

If using RANGE or LIST partitionning, you will have to define the number of partitions you want. ie :

PARTITION BY RANGE (site_id) (
PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);

LIST and RANGE requires a bit of maintenance. If new site_id are added / removed, you will have to adapt your partition scheme.

KEY partitioning will ensure a balanced row repartition across the number of partitions specified:

PARTITION BY KEY(image_id,site_id)
PARTITIONS 10;

Hope it helps.

他のヒント

MySQL partitions is an OK way to go, but it sounds like you have an ideal case for sharding your database as well. There are easy ways to do it yourself for a simple use case like this, and more automated products that can do it too. This way you are not limited to a single server, you can expand the cluster as you get more usage, and you can even specifically allocate site_id keys to specific servers (giving preference to larger customers). For example, a really big customer can have their own shard server, then lots of smaller customers can be co-located on one or more other servers. If you have shared tables, there are ways to replicate GLOBAL tables across all shards. Parallel queries can be supported if you need to access data across all customers.

ライセンス: CC-BY-SA帰属
所属していません StackOverflow
scroll top