postgresql - how and why indexes are bigger than their tables
-
26-09-2020 - |
Question
I'm using postgresql 9.3 and trying to understand how and why indexes are bigger than their tables.
Sample output:
database_name | database_size | table_name | table_size | indexes_size | total_size
---------------+---------------+--------------------------------------------------------------+------------+--------------+------------
foo_12345 | 412 MB | "foobar_dev_12345"."fact_mobile_sends" | 57 MB | 131 MB | 189 MB
foo_12345 | 412 MB | "foobar_dev_12345"."fact_mobile_started" | 17 MB | 39 MB | 56 MB
foo_12345 | 412 MB | "foobar_dev_12345"."fact_mobile_stopped" | 16 MB | 35 MB | 51 MB
I'm running the following query to get the table and index sizes.
SELECT
table_catalog AS database_name,
pg_size_pretty(pg_database_size(current_database())) As database_size,
table_name,
pg_size_pretty(table_size) AS table_size,
pg_size_pretty(indexes_size) AS indexes_size,
pg_size_pretty(total_size) AS total_size
FROM (
SELECT
table_catalog,
pg_database_size(current_database()) AS database_size,
table_name,
pg_table_size(table_name) AS table_size,
pg_indexes_size(table_name) AS indexes_size,
pg_total_relation_size(table_name) AS total_size
FROM (
SELECT ('"' || table_schema || '"."' || table_name || '"') AS table_name, table_catalog
FROM information_schema.tables
) AS all_tables
ORDER BY total_size DESC
) AS pretty_sizes;
Is my query correct? What would cause indexes to be bigger?
Solution
Possible reasons:
Numerous and probably overlapping indexes on the table; have a look with
\d
Bloat due to high update churn can sometimes affect indexes more than tables, depending on update patterns. Examine the size of each individual index to see if it makes sense.
GiST indexes, if used, can be quite large
Unlike what I originally thought this is not an issue with TOAST
out-of-line storage not being counted, since pg_table_size
includes TOAST
tables.
Note that if you're concerned about index bloat and decide to REINDEX
some of all of the involved indexes, consider setting a non-default FILLFACTOR
first if the table is subject to lots of updates (or inserts+deletes). Otherwise you'll take a write performance hit because the index doesn't have any space to insert new values so it'll force lots of page splits and be less efficiently structured.