MySQL and correctly setting up partitions

Question 1

If you use PARTITION BY KEY(image_id, site_id), you cannot be sure that each partition will contain only its own separate set of site_id values, because this kind of partitioning applies MySQL's internal hashing function to (image_id, site_id), and the result determines which partition the row is inserted into.

If you want to ensure separation, you should use RANGE or LIST partitioning.
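As a rough sketch of what LIST partitioning could look like here: the table name `images`, the `path` column, and the specific site_id values are assumptions made for illustration, not taken from the question. Note that MySQL requires every primary or unique key on a partitioned table to include the partitioning column, which is why site_id is part of the primary key below.

```sql
-- Hypothetical table; names and values are illustrative only.
CREATE TABLE images (
    image_id BIGINT UNSIGNED NOT NULL,
    site_id  INT UNSIGNED    NOT NULL,
    path     VARCHAR(255)    NOT NULL,
    -- The partitioning column must be part of every unique key.
    PRIMARY KEY (image_id, site_id)
)
PARTITION BY LIST (site_id) (
    PARTITION p_site1 VALUES IN (1),        -- one site alone in its partition
    PARTITION p_site2 VALUES IN (2),
    PARTITION p_small VALUES IN (3, 4, 5)   -- several small sites sharing one
);
```

With this layout, a row whose site_id is not listed in any partition is rejected on insert, so the partition list has to be kept in step with the set of sites.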

Question 2

With RANGE or LIST partitioning, you have to define each partition yourself, along with the values it holds, e.g.:

PARTITION BY RANGE (site_id) (
PARTITION p0 VALUES LESS THAN (6),
PARTITION p1 VALUES LESS THAN (11),
PARTITION p2 VALUES LESS THAN (16),
PARTITION p3 VALUES LESS THAN (MAXVALUE)
);

LIST and RANGE require a bit of maintenance. If site_id values are added or removed, you will have to adapt your partition scheme.
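That maintenance might look roughly like the following for the RANGE example above. The table name `images` and the new boundary value 21 are assumptions for illustration. Because the last partition uses MAXVALUE, a new range cannot simply be appended with ADD PARTITION; the catch-all partition has to be split instead.

```sql
-- Split the catch-all partition p3 so that site_id 16..20 get their own
-- partition and everything from 21 upwards lands in the new catch-all p4.
ALTER TABLE images REORGANIZE PARTITION p3 INTO (
    PARTITION p3 VALUES LESS THAN (21),
    PARTITION p4 VALUES LESS THAN (MAXVALUE)
);

-- Removing a partition also deletes every row stored in it.
ALTER TABLE images DROP PARTITION p0;
```

REORGANIZE PARTITION physically moves the affected rows, so on a large table it is worth scheduling it for a quiet period.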

KEY partitioning will ensure a balanced distribution of rows across the number of partitions specified:

PARTITION BY KEY(image_id,site_id)
PARTITIONS 10;
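One way to check how rows actually end up distributed is to query information_schema. This is only a sketch: the table name `images` is an assumption, and for InnoDB the TABLE_ROWS figure is an estimate rather than an exact count.

```sql
-- Approximate number of rows per partition for the (hypothetical) images table
-- in the currently selected database.
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'images'
ORDER BY PARTITION_ORDINAL_POSITION;
```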

Hope it helps.

Question 2

MySQL partitioning is an OK way to go, but it sounds like you also have an ideal case for sharding your database. There are easy ways to do it yourself for a simple use case like this, and more automated products that can do it too. With sharding you are not limited to a single server: you can expand the cluster as usage grows, and you can even allocate specific site_id keys to specific servers (giving preference to larger customers). For example, a really big customer can have their own shard server, while lots of smaller customers are co-located on one or more other servers. If you have shared tables, there are ways to replicate GLOBAL tables across all shards. Parallel queries can be supported if you need to access data across all customers.
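For the do-it-yourself route, one common approach is a small directory table that maps each site_id to the server holding its data. The sketch below is an assumption about how that could be set up; the table name `shard_map`, its columns, and the host names are all hypothetical.

```sql
-- Hypothetical lookup table, kept on a central server or replicated to all shards.
CREATE TABLE shard_map (
    site_id    INT UNSIGNED NOT NULL PRIMARY KEY,
    shard_host VARCHAR(255) NOT NULL
);

-- A large customer gets a dedicated shard; smaller ones share a server.
INSERT INTO shard_map (site_id, shard_host) VALUES
    (1, 'shard-big-1.example.internal'),
    (2, 'shard-shared-1.example.internal'),
    (3, 'shard-shared-1.example.internal');

-- The application looks up the shard first, then connects to that host.
SELECT shard_host FROM shard_map WHERE site_id = 2;
```

Moving a customer to another server then comes down to copying their rows and updating one row in the directory table.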