Question

I want to use pt-online-schema-change to change the schema of a big table (~100M records). Does this tool affect the performance of MySQL while it's running?


Solution

By design, the tool will have no significant effect on performance. First, let's review what the tool does:

  • attach triggers to the current table, to copy all updates, deletes and inserts to the new table.
  • copy existing data over in chunks, partitioned by the key
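
As a concrete illustration, here is a minimal sketch of what a run might look like; the database, table, and column names are invented for the example:

    # hypothetical run: add a column to mydb.big_table
    pt-online-schema-change \
        --alter "ADD COLUMN flags TINYINT NOT NULL DEFAULT 0" \
        D=mydb,t=big_table \
        --execute

    # while the copy is in progress, the triggers attached to the original
    # table can be seen from another session:
    mysql mydb -e "SHOW TRIGGERS LIKE 'big_table'"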

The first part is going to double all writes, and there is no way around that. The second part is a batch operation that can take locks on the current table and use up a lot of I/O.

Fortunately, the second part is split into chunks, and pt-online-schema-change is quite clever about how big the chunks are and how long it waits between chunks (see the option sketch after this list):

  • it checks slave replication lag between chunks and pauses if the lag is too great; it can recursively discover slaves to check.
  • it checks load (typically measured by the number of running threads) and pauses if too many queries are running (which implies lock contention or high CPU/I/O usage); it can even abort if the load is extremely high.
  • it configures InnoDB lock settings such that it is most likely to be the victim of any lock contention, so production queries will run smoothly.
  • by default, the chunk size is dynamically changed to keep its runtime consistent, using a weighted average of previous chunk runtimes.
  • chunks that are too big (e.g. due to a huge number of rows with the same key) will be skipped over.
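
The knobs behind those behaviors look roughly like this; the threshold values below are illustrative examples close to the tool's defaults, not tuned recommendations:

    # hypothetical example of the throttling-related options
    #   --max-lag           pause while slave lag exceeds this many seconds
    #   --recursion-method  how the tool discovers slaves to check
    #   --max-load          pause while this status counter is above the threshold
    #   --critical-load     abort entirely if the counter gets this high
    #   --chunk-time        target runtime per copy chunk, in seconds
    #   --chunk-size-limit  skip chunks estimated at more than this multiple of the chunk size
    pt-online-schema-change \
        --alter "ADD COLUMN flags TINYINT NOT NULL DEFAULT 0" \
        D=mydb,t=big_table \
        --max-lag 1 \
        --recursion-method processlist \
        --max-load Threads_running=25 \
        --critical-load Threads_running=50 \
        --chunk-time 0.5 \
        --chunk-size-limit 4.0 \
        --execute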

Due to this, it is likely that your server will only be slightly affected by the copy. Of course, there is no guarantee, so if possible you should run the tool against a staging copy of the database first. If you do run into trouble, you can safely abort the tool with no loss of data.
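
A low-risk way to rehearse the change on that staging copy is a dry run; this sketch reuses the invented names from above:

    # hypothetical rehearsal: --dry-run creates and alters the empty "new"
    # table but does not create triggers or copy any rows
    pt-online-schema-change \
        --alter "ADD COLUMN flags TINYINT NOT NULL DEFAULT 0" \
        D=mydb,t=big_table \
        --dry-run

The original table is only replaced by an atomic rename at the very end of a successful run, which is part of why aborting mid-copy does not lose data.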

OTHER TIPS

The online schema change is a fairly heavy operation: while it runs, every insert/update/delete against the original table is doubled (the tool adds triggers so that these actions are duplicated on the new copy), and piece by piece every row of the original table is copied over.

If you're concerned about what effects this may have on your database, check out the doc: http://www.percona.com/doc/percona-toolkit/2.2/pt-online-schema-change.html

There are several ways you can have the tool throttle itself, including checking the thread count, checking replication lag, and adjusting how large each chunk is and how long each chunk should take.

It will read from and write to disk, consume memory, and use CPU, so yes, it will affect performance while it is running. How could it be otherwise?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow