Question

I have a very simple setup: 1 master with 2 slaves. I stopped replication on one of the slaves (with the slave mysqld still running) and changed a row on the master. I have also kept all slaves and the master running and just changed data on the slave.

I ran pt-table-checksum --databases=db_main --user=user --pass='pass' but do not get any differences.

I read the documentation a few times, but can't seem to find what I might be missing.

Extra info: the storage engine is InnoDB, the master is on MySQL and slaves on MariaDB. Full replication with slaves set as read-only. Permissions are correct.

I want to make sure the tool is working as expected. We had a slave stop running and I ran pt-table-checksum. It told me there was no difference in data (which was obviously false). I want to make sure pt-table-checksum is working as expected, but I have not been able to create a test where pt-table-checksum actually detects a difference.

Just now I changed data on the running slave (so data is different in slave and master). When I run pt-table-checksum on that database, I don't see any "diffs". Because it is a mature tool, I assume I am doing something wrong, but can't figure out what it is.


Solution

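When you stop replication on a slave, it disconnects from the master, so the master no longer knows about it. By default pt-table-checksum discovers slaves through the master, so it doesn't see the stopped slave.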

pt-table-checksum is a mature tool and the most popular in Percona Toolkit. The data inconsistency is easy to reproduce: just write something on a slave. By the way, there is a Vagrant master->slave configuration in https://github.com/twindb/twindb_table_compare that you can use to test pt-table-checksum.

By default pt-table-checksum connects to localhost. Are you running it on the master? Can you connect to the slave with the mysql client using the same user and password?


There were two issues preventing the check from working correctly:

  1. When replication is stopped on a slave, the master will not see it as a slave (and therefore not check it).
  2. Permissions allowed pt-table-checksum to connect to the master, but not to the slaves (since that account was restricted to localhost). However, the tool does not report a permission-denied error. Instead it silently ignores the slave.
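
Both issues can be checked by hand before trusting a clean result. The following is only a sketch: the host names and credentials are placeholders, not values from the original setup, and the script is assumed to run on the machine where pt-table-checksum is run. It asks the master which slaves it currently sees, tries a login on the slave with the same account the tool uses, and then runs the checksum with the master host and the discovery method spelled out.

```shell
#!/bin/sh
# Sketch only: hosts and credentials are placeholders (assumptions).
MASTER_HOST="master.example.com"
SLAVE_HOST="slave1.example.com"
DB_USER="user"
DB_PASS="pass"

# Issue 1: does the master currently see any slaves?
# A connected slave shows up as a "Binlog Dump" thread in the process list.
# On recent MySQL versions SHOW SLAVE HOSTS has been replaced by SHOW REPLICAS.
check_master_sees_slaves() {
    mysql -h "$MASTER_HOST" -u "$DB_USER" -p"$DB_PASS" \
        -e "SHOW PROCESSLIST; SHOW SLAVE HOSTS;"
}

# Issue 2: can the checksum account log in to the slave from this machine?
# An account restricted to localhost fails here with "Access denied".
check_slave_login() {
    mysql -h "$SLAVE_HOST" -u "$DB_USER" -p"$DB_PASS" \
        -e "SELECT CURRENT_USER();"
}

# Run the checksum against the master explicitly instead of relying on
# the localhost default, and name the slave discovery methods.
run_checksum() {
    pt-table-checksum --host="$MASTER_HOST" \
        --user="$DB_USER" --password="$DB_PASS" \
        --databases=db_main \
        --recursion-method=processlist,hosts
}

# Run one step at a time, e.g.: sh check_replicas.sh check_slave_login
case "${1:-}" in
    check_master_sees_slaves|check_slave_login|run_checksum) "$1" ;;
esac
```

If the first step shows no slave threads, or the second step is refused, a run that reports no differences says nothing about that slave.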

OTHER TIPS

If I am not mistaken, the tool uses the replication stream to perform the test. That is, it runs the checksum queries on the master and lets them replicate to the slaves, where each slave computes the checksum over its own data for comparison. This is probably why you could not see the diffs in your test with replication stopped: the checksum queries had not yet reached the slave, and by the time they did, the changed row had been replicated as well.
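
Because the results travel through replication, they can also be inspected directly on a slave after a run. The sketch below assumes the tool's default results table, percona.checksums, and uses a placeholder host and credentials. The query follows the one given in the pt-table-checksum documentation for listing tables whose chunks differ from the master.

```shell
#!/bin/sh
# Sketch only: host and credentials are placeholders (assumptions).
SLAVE_HOST="slave1.example.com"
DB_USER="user"
DB_PASS="pass"

# Chunks where the slave's row count or checksum differs from the master's.
# Assumes the default --replicate table, percona.checksums.
DIFF_QUERY="
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE master_cnt <> this_cnt
   OR master_crc <> this_crc
   OR ISNULL(master_crc) <> ISNULL(this_crc)
GROUP BY db, tbl;"

# Run the query on the slave; no rows returned means no recorded differences.
show_diffs_on_slave() {
    mysql -h "$SLAVE_HOST" -u "$DB_USER" -p"$DB_PASS" -e "$DIFF_QUERY"
}

# Run it explicitly, e.g.: sh show_diffs.sh show_diffs_on_slave
case "${1:-}" in
    show_diffs_on_slave) "$1" ;;
esac
```

An empty percona.checksums table on the slave is itself a hint that the checksum queries never arrived there, for example because replication was stopped during the run.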

Licensed under: CC-BY-SA with attribution