How to split a MySQL table file?

https://dba.stackexchange.com/questions/97677

20-12-2020

Question

I have a table in a MySQL database for which innodb_file_per_table is enabled. One of my tables is about 20GB and growing. How can I partition the table into separate files?

Suppose I have a table named tab1 stored at /var/lib/mysql/db_name/tab1.frm, about 20GB in size. I want to partition it into two files, tab1-1.frm and tab1-2.frm, of about 10GB each.

Table definition:

CREATE TABLE cdr (
     gid bigint(20) NOT NULL AUTO_INCREMENT,
     id bigint(20) NOT NULL,
     start datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
     clid varchar(80) NOT NULL DEFAULT '',
     src varchar(80) NOT NULL DEFAULT '',
     channel varchar(80) NOT NULL DEFAULT '',
     duration int(11) NOT NULL DEFAULT '0',
     uniqueid varchar(32) NOT NULL DEFAULT '',
     dnid varchar(20) NOT NULL DEFAULT '',
     service varchar(100) NOT NULL DEFAULT '',
     cost int(11) NOT NULL DEFAULT '0',
     PRIMARY KEY (gid),
     UNIQUE KEY id (id,prefix),
     KEY start (start),
     KEY clid (clid),
     KEY service (service)
) ENGINE=InnoDB AUTO_INCREMENT=37024605 DEFAULT CHARSET=latin1

Solution

I assume it is the .ibd file that is 20GB, not the .frm?

If you want two physically separate tables, you could copy tab1-1 into tab1-2 (so you have two copies), then delete half the data from each. You'd still need to run OPTIMIZE TABLE on both tables afterwards to shrink them.
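A minimal sketch of that copy-and-delete approach follows. The table name tab1_2, the column id, and the split point 10000 are assumptions for illustration only; pick a boundary that actually divides your rows roughly in half.

```sql
-- Assumed names: tab1 is the existing table, tab1_2 the new copy,
-- id a numeric column to split on. All are hypothetical.
CREATE TABLE tab1_2 LIKE tab1;            -- same columns and indexes, no rows
INSERT INTO tab1_2 SELECT * FROM tab1;    -- full copy of the data

-- Keep the lower half in tab1 and the upper half in tab1_2.
DELETE FROM tab1   WHERE id >  10000;
DELETE FROM tab1_2 WHERE id <= 10000;

-- Rebuild both tables so the .ibd files give the freed space back.
OPTIMIZE TABLE tab1;
OPTIMIZE TABLE tab1_2;
```

Note that during the copy you temporarily need disk space for both full-size tables, and deleting half of a 20GB table in one statement can take a long time.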

Using the MySQL partition option, you would have one table, tab1, which you would partition by some range (e.g. all IDs up to 10000 in partition 1, 10001 - 20000 in partition 2, etc.). This would then show up on your filesystem as two (or more) separate files (in the form of tab1.frm and various files like tab1#p#partition1.ibd, tab1#p#partition2.ibd). It wouldn't give you control over the file sizes, however.

In the above example, if all your IDs were below 10000, for instance, partitioning would make no difference to your file sizes.
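As a rough illustration of range partitioning, here is a sketch on a simplified, hypothetical version of tab1. The columns id and payload, the partition names, and the boundaries are all assumptions, not the real table.

```sql
-- Hypothetical, simplified table; only the PARTITION BY clause matters here.
CREATE TABLE tab1 (
    id      BIGINT      NOT NULL,
    payload VARCHAR(80) NOT NULL DEFAULT '',
    PRIMARY KEY (id)
) ENGINE=InnoDB
PARTITION BY RANGE (id) (
    PARTITION partition1 VALUES LESS THAN (10001),    -- ids up to 10000
    PARTITION partition2 VALUES LESS THAN (20001),    -- ids 10001 to 20000
    PARTITION pmax       VALUES LESS THAN MAXVALUE    -- everything above
);

-- See how rows and data are spread across the partitions.
SELECT PARTITION_NAME, TABLE_ROWS, DATA_LENGTH
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'tab1';
```

The second query makes the point about file sizes visible: if every id falls in one range, one partition holds all the rows and the others stay nearly empty.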

It would help if you provided the table structure and the MySQL version.

UPDATE: 2015-04-13 11:44 (GMT+1):

Looking at your table structure, the first thing I see is that you are going to have problems because you have both a primary key and a separate unique key.

All columns used in the partitioning expression for a partitioned table must be part of every unique key that the table may have.

In other words, every unique key on the table must use every column in the table's partitioning expression. (This also includes the table's primary key, since it is by definition a unique key.)

http://dev.mysql.com/doc/refman/5.6/en/partitioning-limitations-partitioning-keys-unique-keys.html

At present, any attempt at partitioning will result in an error:

e.g.

Error Code: 1503. A PRIMARY KEY must include all columns in the table's partitioning function

Error Code: 1503. A UNIQUE INDEX must include all columns in the table's partitioning function
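For illustration, a statement along the following lines, run against the table as posted, would be expected to be rejected with the first of these errors, because start is not part of the primary key. The partition names and date boundaries are made up for the example.

```sql
-- Expected to fail on the original definition:
-- PRIMARY KEY (gid) does not contain the partitioning column start.
ALTER TABLE cdr
PARTITION BY RANGE (TO_DAYS(start)) (
    PARTITION p2014 VALUES LESS THAN (TO_DAYS('2015-01-01')),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```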

I would recommend looking at some of the following first:

Also, test this on a non-live server, as it may well take quite some time to build the partitions, which you wouldn't want to do on a live database.

UPDATE: 2015-04-27 09:52 (GMT+1):

It seems you have some decisions to make...

In order to use the start column, you would need to add start to both the primary key (PK) and the unique key.

    ALTER TABLE `test`.`cdr` 
    DROP PRIMARY KEY
    ,DROP INDEX `id`
    ,ADD PRIMARY KEY (`gid`,`start`)
    ,ADD UNIQUE INDEX `id` (`id` ASC, `prefix` ASC, `start` ASC);

The effect on your PK would be negligible (as long as you DON'T write directly to it, which would defeat the point of the auto-increment anyway).

It would, however, potentially cause a problem for your unique key, as it would now only be unique across id, prefix, start.

e.g. insert into test.cdr (id,prefix,start) values (100,'abc','2015-04-27 09:19:00'),(100,'abc','2015-04-27 09:19:01');

will become valid, whereas previously it would have generated a duplicate key error.

So you would need to consider how important that uniqueness is.

If it IS very important, you may want to consider partitioning by id instead, using id, prefix as your PK and converting gid to a standard (not unique) key. How easy that would be depends on what your data looks like.

If it isn't very important, then you could convert the unique key to a standard key and make gid, start your PK.
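A sketch of that last option, under some assumptions: the prefix column exists in the real table (it is missing from the posted definition), and yearly partitions on start are what you want. Partition names and boundaries are illustrative.

```sql
-- Step 1: make start part of the PK and demote the unique key to a plain index.
ALTER TABLE cdr
    DROP PRIMARY KEY,
    DROP INDEX id,
    ADD PRIMARY KEY (gid, start),     -- gid stays first, as AUTO_INCREMENT requires
    ADD INDEX id (id, prefix);        -- no longer enforces uniqueness

-- Step 2: partition by year of start (hypothetical boundaries).
ALTER TABLE cdr
PARTITION BY RANGE (TO_DAYS(start)) (
    PARTITION p2014 VALUES LESS THAN (TO_DAYS('2015-01-01')),
    PARTITION p2015 VALUES LESS THAN (TO_DAYS('2016-01-01')),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```

Both statements rebuild the table, so on a table of this size they are best tried on a copy first.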

Other tips

You can add the date to the PK and replace the unique key with a regular key. You then have to handle the uniqueness check in the application.

Another solution is to add, for example, a Year column derived from start, and then add Year to the PK and the unique key. Partitioning by year should then work.
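One way the year-column idea might look is sketched below. The column name start_year is hypothetical, prefix is assumed to exist in the real table, and the application (or a trigger) is assumed to keep start_year in step with start on every insert and update.

```sql
-- Add a plain column holding the year of start and backfill it.
ALTER TABLE cdr ADD COLUMN start_year SMALLINT NOT NULL DEFAULT 0;
UPDATE cdr SET start_year = YEAR(start);

-- Include the new column in every unique key, as partitioning requires.
ALTER TABLE cdr
    DROP PRIMARY KEY,
    DROP INDEX id,
    ADD PRIMARY KEY (gid, start_year),
    ADD UNIQUE INDEX id (id, prefix, start_year);

-- Partition on the year column (hypothetical boundaries).
ALTER TABLE cdr
PARTITION BY RANGE (start_year) (
    PARTITION p2014 VALUES LESS THAN (2015),
    PARTITION p2015 VALUES LESS THAN (2016),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```

With this layout (id, prefix) is only guaranteed unique within a single year, which is a weaker guarantee than the original key but stronger than uniqueness per exact timestamp.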

Licensed under: CC-BY-SA with attribution

Not affiliated with dba.stackexchange