Question

At my company we occasionally import large customer MySQL databases (40GB+) which can take over a day to load on our developer machines. While we accept this load time since it's done in the background, we lack any solid ability to estimate when the imports will finish. This blocks us from scheduling an appropriate time to act on it. It's like waiting for the cable guy to show up.

Right now my best strategy is a quick show tables command to see what percentage of the tables have been loaded. However, as table sizes vary greatly both with each other and with each customer, this isn't even close to reliable.

Does anyone have a good technique or tool that can be used to get a reliable percentage of how far along a MySQL import is?

Was it helpful?

Solution

You can do it with the pv command by piping the dump to mysql.

pv -i 1 -p -t -e /path/to/sql/dump | mysql -u USERNAME -p DATABASE_NAME

It will show you a progress bar during the import progress based on the IO throughput. (As seen here.)

OTHER TIPS

HeidiSql will tell you how many GB have loaded so far, which is pretty helpful when trying to figure out how much still needs to be loaded.

You can also do a show processlist on the source database to see how far (by auto incrementing ID) into a specific table the export is. Eric makes a good point in that you can monitor the size of the database data directory as compared to the size of the source.

Something I have found useful in speeding up dumps/imports, is to do it by tables and then run multiple mysqldumps simultaneously essentially multi-threading your process. I usually did about 4 separate dump/restores at a time. The optimal amount will depend on your hardware and disk capabilities.

A very simplistic example to give you an idea:

mysqldump dbname table1 table2 table3 | mysql -h host &
mysqldump dbname table4 table5 table6 | mysql -h host &
mysqldump dbname table7 table8 table9 | mysql -h host &
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top