Question

Does Seconds_Behind_Master always show the correct number of seconds the slave is behind the master?

I have also implemented mk-heartbeat replication monitoring.

It shows a different result from Seconds_Behind_Master.

Which one is more accurate?

Solution

Seconds_Behind_Master is based on the difference between UNIX_TIMESTAMP() and the timestamp logged for the query within the binary log of the master, or the relay log on the slave. It is a poor guide to real-time lag when replication is working through a series of long-running queries. The reported lag can grow astronomically until all relay log entries are processed, and then Seconds_Behind_Master suddenly drops to zero. In real-time terms, you have no way of knowing or anticipating when the lag will eventually dissipate until it hits zero.

Strictly using mysql, you can monitor replication lag in terms of realtime in the following manner:

  • From SHOW SLAVE STATUS\G, get two values:
    • Relay_Master_Log_File is the master's binary log file containing the last SQL statement from the master that was successfully executed on the slave.
    • Exec_Master_Log_Pos is the position within Relay_Master_Log_File of the last SQL statement from the master that was successfully executed on the slave.
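Pulling those two values out of the status output can be scripted. The following is a minimal sketch, assuming the usual "Name: value" layout of the \G output; slave_status_field is a hypothetical helper name, not part of mysql.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one field from `SHOW SLAVE STATUS\G` output on stdin.
# Field lines are assumed to look like: "        Exec_Master_Log_Pos: 4711"
slave_status_field() {
    awk -v key="$1" '
        { line = $0; sub(/^[ \t]+/, "", line) }      # drop the left padding
        # Match only when the line STARTS with the key, so that asking for
        # Master_Log_File does not pick up Relay_Master_Log_File.
        index(line, key ": ") == 1 { print substr(line, length(key) + 3); exit }
    '
}
```

For example, mysql -e 'SHOW SLAVE STATUS\G' | slave_status_field Exec_Master_Log_Pos would print the position, assuming the mysql client can already connect to the slave.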

You could perform this against that binary log:

TMSTMP=`mysqlbinlog --start-position=EMLP RMFL | head -50 | grep "^SET TIMESTAMP=" | head -1 | sed 's/=/ /g' | sed 's/\// /g' | awk '{print $3}'`
RIGHTNOW=`date +%s`
(( REPLAG = RIGHTNOW - TMSTMP ))

where EMLP is the Exec_Master_Log_Pos and RMFL is the Relay_Master_Log_File

While this is realistically the correct way to get replication lag using only the log file and position, you must:

  • communicate with the slave to get the needed log file and position
  • communicate with the master to retrieve the dump of the binlog
  • dump the binlog each time you check (mysqlbinlog works safely when the file is not the currently open log; this may require running FLUSH LOGS on the master before dumping the master log file)
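One way to cut down on the master-side work is to let mysqlbinlog read the log over the network with its --read-from-remote-server option rather than working on a copy of the file. This is a swapped-in variation, not the exact procedure described here, and it is only a sketch: lag_at_position is a hypothetical name, and the connecting account is assumed to have credentials in an option file and sufficient replication privileges on the master.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lag in seconds for a given master binlog coordinate.
#   $1 = master host, $2 = Relay_Master_Log_File, $3 = Exec_Master_Log_Pos
lag_at_position() {
    local ts
    # Read the binlog from the master over the network and keep only the
    # first "SET TIMESTAMP=<epoch>" value at or after the given position.
    ts=$(mysqlbinlog --read-from-remote-server --host="$1" \
             --start-position="$3" "$2" |
         sed -n 's/^SET TIMESTAMP=\([0-9][0-9]*\).*/\1/p' | head -n 1)
    # No timestamp found: possibly nothing is logged after that position.
    [ -n "$ts" ] || return 1
    echo $(( $(date +%s) - ts ))
}
```

It would be called with the two values read from SHOW SLAVE STATUS\G on the slave, for example lag_at_position master-host mysql-bin.000042 4711 (host, file name and position are placeholders).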

Getting replication lag from the binary logs yourself takes a lot more legwork.

mk-heartbeat only checks a heartbeat table and the record in it. Updating a live table on the master and comparing the slave's copy of the heartbeat table to UNIX_TIMESTAMP() on the slave is a much more concise real-time lag measurement.
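To illustrate the idea (this is not mk-heartbeat's own code), here is a hand-rolled sketch. It assumes a hypothetical table created on the master as CREATE TABLE heartbeat (id INT PRIMARY KEY, ts INT NOT NULL), credentials in an option file, and clocks on both servers kept in sync, since any clock skew shows up directly in the result.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the table and column names are assumptions.
#   $1 = host, $2 = database holding the heartbeat table

# Run periodically against the MASTER (for example, once per second).
heartbeat_update() {
    mysql -h "$1" -D "$2" \
        -e 'REPLACE INTO heartbeat (id, ts) VALUES (1, UNIX_TIMESTAMP())'
}

# Run against the SLAVE: seconds between the slave's clock and the
# last heartbeat row that has been replicated to it.
heartbeat_lag() {
    mysql -h "$1" -D "$2" -N -B \
        -e 'SELECT UNIX_TIMESTAMP() - ts FROM heartbeat WHERE id = 1'
}
```

Because the heartbeat row keeps changing on the master, the number printed by heartbeat_lag keeps growing while the slave is stuck on a long-running query, instead of only becoming visible once that query finishes.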

You should go with mk-heartbeat.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange