I need to sort a huge MyISAM table in a certain way so that SELECTs are faster under certain conditions.
Please note that this question is about how to do an ALTER table with zero downtime for SORTING the table in some specific column order. It is not a dup question of other questions asking about the more general case.
A simple way to achieve this is doing something like this:
ALTER table mytable ORDER BY col1, col2;
We may also use myisamchk --sort-records to achieve the same result.
In any case, both approaches let us to do very quick:
SELECT * WHERE col1=x order by col2;
Note that this is not a problem with the index but with fetching large amounts of ordered data from the table.
So far that ALTER has been working well. The problem now is that the ALTER command is slow and it locks the DB.
I believe we may use percona or openark tools for doing the same operation. Something like this:
pt-online-schema-change --alter "ENGINE=MyISAM, ORDER BY col1, col2" D=mydatabase,t=mytable -u root --dry-run
This internally creates a new table copies it and then moves names. It is pretty well documented.
However I'm not sure if/how percona will honour the "ORDER BY". I cannot see anything happening in the dry-run logs (but this may be normal). And this is not explained in the documentation.
Does anyone know how will percona ORDER BY the table?
- Will it do the ordering on the new table (_mytable_new) after mytable is copied and before renaming?
- Will it do the ordering during the copy from mytable as in "INSERT INTO _mytable_new SELECT * FROM mytable ORDER BY col1, col2"?
- Or perhaps "ORDER BY" will never be done?
EDIT: I launched PTDEBUG=1 ./pt-online-schema-change --alter "ENGINE=MyISAM, ORDER BY col1, col2"
on the tests server.
After checking the logs I found out that "ORDER BY" is not being applied... Any ideas? Does openark permit to do so?
Thanks!