Question

What exactly is an InnoDB page checksum? Does a page checksum only detect issues with the underlying storage when writing to or reading from a page/block?

Solution

(I think this is what you are asking about.)

InnoDB "blocks" are 16KB by default. But most disk subsystems work in smaller units (4KB or 512 bytes). For InnoDB data to remain intact, the disk needs to write all 16KB as a unit: either all of it is written or none of it. What can happen instead is that the pieces of the 16KB are written one after another, and the power fails partway through. This leaves a "torn page" ("page" referring to the 16KB block): part new data, part old.

To recover from a torn page, InnoDB does two things. It checksums each block so that a torn page can be detected, and it uses a "double write" so that the page can be repaired. The block (or at least certain critical blocks) is written twice: first to some relatively constant spot (the "double write buffer"), then to the desired location on disk (in the data or index).
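Here is a toy sketch of how a checksum exposes a torn page. It is not InnoDB's real page format: the layout (a 4-byte CRC32 followed by the payload), the 4KB sector size, and the helper names `build_page`, `page_is_valid` and `torn_write` are all assumptions made up for illustration.

```python
import zlib

PAGE_SIZE = 16 * 1024    # default InnoDB page size
SECTOR_SIZE = 4 * 1024   # assumed write unit of the device

def build_page(payload: bytes) -> bytes:
    """Toy layout (hypothetical): 4-byte CRC32 of the payload, then the payload."""
    assert len(payload) == PAGE_SIZE - 4
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def page_is_valid(page: bytes) -> bool:
    """Recompute the checksum and compare it with the stored one."""
    if len(page) != PAGE_SIZE:
        return False
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])

def torn_write(old_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Simulate power failing after only some sectors of the new page hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + old_page[cut:]

# The update changes one byte near the start of the page and one near the end.
old_payload = bytearray(b"A" * (PAGE_SIZE - 4))
new_payload = bytearray(old_payload)
new_payload[100] = ord("X")
new_payload[12000] = ord("Y")

old_page = build_page(bytes(old_payload))
new_page = build_page(bytes(new_payload))

# Only the first 2 of 4 sectors are written before the crash.
torn_page = torn_write(old_page, new_page, sectors_written=2)

print(page_is_valid(old_page))   # True
print(page_is_valid(new_page))   # True
print(page_is_valid(torn_page))  # False
```

The torn page carries the new checksum but only half of the new data, so the recomputed checksum no longer matches. Note that the checksum only says "this page is bad"; it cannot say what the page should have contained.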

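When recovering from a crash, the pages in the "double write buffer" are compared with their counterparts in the data files. If a page in the data file turns out to be torn (its checksum does not match), it is repaired from the intact copy in the buffer.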
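The recovery step can be sketched the same way. This is a simplified model, not InnoDB's actual code: the data file and the double write buffer are plain dictionaries keyed by page number, the page layout is the same made-up "CRC32 then payload" format, and `recover` is a hypothetical name.

```python
import zlib

PAGE_SIZE = 16 * 1024  # default InnoDB page size

def build_page(payload: bytes) -> bytes:
    """Toy layout (hypothetical): 4-byte CRC32 of the payload, then the payload."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def page_is_valid(page: bytes) -> bool:
    if len(page) != PAGE_SIZE:
        return False
    return int.from_bytes(page[:4], "big") == zlib.crc32(page[4:])

def recover(data_file: dict, doublewrite: dict) -> list:
    """Replace torn pages in data_file with intact copies from doublewrite."""
    repaired = []
    for page_no, copy in doublewrite.items():
        if page_is_valid(data_file.get(page_no, b"")):
            continue  # page in the data file is fine, nothing to do
        if page_is_valid(copy):
            data_file[page_no] = copy
            repaired.append(page_no)
    return repaired

old_payload = bytearray(b"A" * (PAGE_SIZE - 4))
new_payload = bytearray(old_payload)
new_payload[12000] = ord("Y")
old_page = build_page(bytes(old_payload))
new_page = build_page(bytes(new_payload))

data_file = {7: old_page}
doublewrite = {}

# Step 1: the full page goes to the double write area first.
doublewrite[7] = new_page
# Step 2: the write to the real location is cut off halfway by a crash.
data_file[7] = new_page[:8192] + old_page[8192:]

repaired = recover(data_file, doublewrite)
print(repaired)                    # [7]
print(data_file[7] == new_page)    # True
```

If the crash had hit during step 1 instead, the copy in the double write area would be the torn one, but the page in the data file would still be the old, intact version, so there is always one good copy to start from.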

The double write, as its name implies, is costly. (I have no metrics on how costly; I suspect the cost depends heavily on HDD vs SSD and RAID controllers.) Turning it off is a way to gain some more speed, but with risk. FusionIO was (it's been bought out) the only drive manufacturer who guaranteed 16KB atomic writes; I hope others have added this feature.

RAID with Battery Backed Write Cache should make the double write virtually zero cost.

A different checksum... The Percona Toolkit (pt-table-checksum) uses a "checksum" for data. Since rows are not necessarily laid out identically between Master and Slave, checksumming the files is not useful for seeing whether the tables match. I suspect it involves reading the rows in a repeatable order, and checksumming each row or clump of rows.
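To make that guess concrete, here is a minimal sketch of the idea. It is not how Percona Toolkit is actually implemented; the function names, the in-memory row tuples, and the assumption that the first column is the primary key are all invented for illustration.

```python
import zlib

def chunk_checksums(rows, chunk_size):
    """Order rows by primary key (first column), then CRC32 each chunk."""
    ordered = sorted(rows, key=lambda r: r[0])
    sums = []
    for start in range(0, len(ordered), chunk_size):
        chunk = ordered[start:start + chunk_size]
        crc = 0
        for row in chunk:
            # Feed each row into a running CRC for the chunk.
            crc = zlib.crc32(repr(row).encode("utf-8"), crc)
        sums.append((chunk[0][0], chunk[-1][0], crc))
    return sums

def differing_chunks(source_rows, replica_rows, chunk_size=2):
    """Return the (first_key, last_key) ranges whose checksums do not match."""
    replica = {(lo, hi): crc
               for lo, hi, crc in chunk_checksums(replica_rows, chunk_size)}
    return [(lo, hi)
            for lo, hi, crc in chunk_checksums(source_rows, chunk_size)
            if replica.get((lo, hi)) != crc]

source = [(1, "ann"), (2, "bob"), (3, "cat"), (4, "dan")]
# Same rows in a different physical order, with one value that has drifted.
replica = [(3, "cat"), (1, "ann"), (4, "don"), (2, "bob")]

print(differing_chunks(source, replica))  # [(3, 4)]
```

Because the rows are sorted before checksumming, the physical layout no longer matters, and a mismatch narrows the problem down to a key range instead of the whole table.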

"rsync" does a similar thing, but at the file level.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange