Question

MySQL's database size can be calculated with

SELECT table_schema AS db_name, SUM(data_length + index_length) AS size
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema != 'INFORMATION_SCHEMA'
GROUP BY db_name

But you can also calculate the database size by looking at the disk usage of the data directory. Should these two numbers ever deviate? Is there ever a reason to use one of these methods over the other? By extension, how are INFORMATION_SCHEMA.TABLES's data_length and index_length calculated?


Solution

Well, it seems like there are two ways they can deviate, and both affect performance:

  • Statistics: when you query with innodb_stats_on_metadata enabled (it has been OFF by default since MySQL 5.6.6), you force the statistics to be updated (if they are not cached). You can change the setting, but doing so requires a GLOBAL change. From the docs:

    When innodb_stats_on_metadata is enabled, InnoDB updates non-persistent statistics when metadata statements such as SHOW TABLE STATUS or when accessing the INFORMATION_SCHEMA.TABLES or INFORMATION_SCHEMA.STATISTICS tables. (These updates are similar to what happens for ANALYZE TABLE.) When disabled, InnoDB does not update statistics during these operations. Leaving the setting disabled can improve access speed for schemas that have a large number of tables or indexes. It can also improve the stability of execution plans for queries that involve InnoDB tables.

    To change the setting, issue the statement SET GLOBAL innodb_stats_on_metadata=mode, where mode is either ON or OFF (or 1 or 0). Changing the setting requires privileges sufficient to set global system variables (see Section 5.1.9.1, “System Variable Privileges”) and immediately affects the operation of all connections.

  • Cache: otherwise, it seems that INFORMATION_SCHEMA reads from a cache in Table_statistics::get_stat. That seems to be documented under information_schema_stats_expiry. It also seems you can refresh the cache with ANALYZE TABLE, which both the docs and the test suite mention. So the cached values can lag behind by up to that many seconds.
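As a rough sketch, this is how you might inspect both settings and force fresh numbers before running the size query. It assumes MySQL 8.0 (information_schema_stats_expiry does not exist in earlier versions), and mydb and mytable are placeholder names, not anything from the question:

```sql
-- Inspect the two settings discussed above.
SHOW GLOBAL VARIABLES LIKE 'innodb_stats_on_metadata';
SHOW VARIABLES LIKE 'information_schema_stats_expiry';  -- MySQL 8.0+ only

-- Bypass the cached table statistics for this session only (MySQL 8.0+).
-- With an expiry of 0, INFORMATION_SCHEMA queries ask the storage engine
-- for current values instead of reading the cached ones.
SET SESSION information_schema_stats_expiry = 0;

-- Alternatively, refresh the cached values for a single table.
-- mydb.mytable is a placeholder.
ANALYZE TABLE mydb.mytable;

-- Re-run the size query for the schema of interest.
SELECT table_schema AS db_name, SUM(data_length + index_length) AS size
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'mydb'
GROUP BY db_name;
```

Setting the expiry at session scope avoids changing behavior for other connections, unlike innodb_stats_on_metadata, which can only be set globally.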

I don't see anything else documented or in the code that would lead me to believe there is a difference.

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange