Question

Let's create two test tables in a PostgreSQL 13 database:

CREATE TABLE foo (
  id bigint GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
  value int NOT NULL
);
CREATE TABLE bar (
  id bigint PRIMARY KEY,
  category_id bigint NOT NULL,
  foo_id bigint REFERENCES foo (id),
  value int
);
CREATE INDEX bar_category_id_ix ON bar (category_id);

and disable autovacuum for these tables:

ALTER TABLE foo SET (autovacuum_enabled = false);
ALTER TABLE bar SET (autovacuum_enabled = false);

Insert 500000 (half a million) records into foo, transfer them to bar, and analyze the tables:

INSERT INTO foo (value) SELECT * FROM generate_series(1, 500000);
ANALYZE foo;
INSERT INTO bar (id, category_id, foo_id, value) SELECT id, 1, id, value FROM foo WHERE value <= 500000;
ANALYZE bar;

Optionally, verify that only ANALYZE (and no autovacuum) was performed on these tables:

SELECT relname, last_autovacuum, last_vacuum, last_autoanalyze, last_analyze FROM pg_stat_user_tables WHERE relname IN ('foo', 'bar');

Insert another chunk of 500000 records (but don't run ANALYZE):

INSERT INTO foo (value) SELECT * FROM generate_series(500001, 1000000);
INSERT INTO bar (id, category_id, foo_id, value) SELECT id, 2, id, value FROM foo WHERE value > 500000;

Since we did not run ANALYZE, the table statistics are outdated; they still reflect the stage when foo and bar contained half a million records each. Now let's check the query plans:

EXPLAIN SELECT * FROM bar
JOIN foo ON bar.foo_id = foo.id
WHERE category_id = 2;
----
Nested Loop  (cost=0.85..12.89 rows=1 width=40)
  ->  Index Scan using bar_category_id_ix on bar  (cost=0.42..4.44 rows=1 width=28)
        Index Cond: (category_id = 2)
  ->  Index Scan using foo_pkey on foo  (cost=0.42..8.44 rows=1 width=12)
        Index Cond: (id = bar.foo_id)

and

EXPLAIN SELECT * FROM bar
JOIN foo ON bar.foo_id = foo.id;
---
Hash Join  (cost=32789.00..71320.29 rows=999864 width=40)
  Hash Cond: (bar.foo_id = foo.id)
  ->  Seq Scan on bar  (cost=0.00..17351.64 rows=999864 width=28)
  ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=12)
        ->  Seq Scan on foo  (cost=0.00..15406.00 rows=1000000 width=12)

I understand that the 1st query plan wrongly estimated only 1 row (rows=1) for the condition category_id = 2 because the statistics are outdated (ANALYZE was performed before the records with category_id = 2 were inserted). (1) But then, how did the 2nd query plan arrive at a good estimate (rows=999864) for the condition bar.foo_id = foo.id?

Also if we run:

EXPLAIN SELECT * FROM bar
JOIN foo ON bar.foo_id = foo.id
WHERE category_id = 1;
----
Hash Join  (cost=32789.00..73819.95 rows=999864 width=40)
  Hash Cond: (bar.foo_id = foo.id)
  ->  Seq Scan on bar  (cost=0.00..19851.30 rows=999864 width=28)
        Filter: (category_id = 1)
  ->  Hash  (cost=15406.00..15406.00 rows=1000000 width=12)
        ->  Seq Scan on foo  (cost=0.00..15406.00 rows=1000000 width=12)

(2) Why does the planner estimate 999864 rows for the condition category_id = 1? The statistics should show that only about 500000 rows satisfy it.

NOTE: I arrived at these questions because I observed empirically that conditions involving only primary key columns produce a better query plan even when the table has not been analyzed, but I did not find anything about this behavior in the official PostgreSQL documentation.

Solution

The "magic" you observed lies in this detail of the query planner. Quoting the manual:

In any case, the planner will scale the values it finds in pg_class to match the current physical table size, thus obtaining a closer approximation.

The values in question are reltuples and relpages in pg_class: the number of live rows and the number of data pages on disk.

Since the physical size of the tables has roughly doubled since the last ANALYZE, Postgres scales the stored row count accordingly and expects roughly twice as many rows, which explains the fairly accurate estimate for the sequential scan.
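You can inspect the scaling inputs yourself. reltuples and relpages still hold the figures from the last ANALYZE, while the current physical size comes from the relation's actual on-disk file. A sketch of the arithmetic the planner performs (the column aliases are mine; the catalog columns and functions are standard PostgreSQL):

```sql
-- reltuples/relpages: row count and page count as of the last ANALYZE.
-- The planner derives a row density from them and multiplies it by the
-- table's current page count to approximate the current row count.
SELECT c.relname,
       c.reltuples,                                    -- rows at last ANALYZE
       c.relpages,                                     -- pages at last ANALYZE
       pg_relation_size(c.oid)
         / current_setting('block_size')::int          AS current_pages,
       round((c.reltuples / c.relpages)
             * (pg_relation_size(c.oid)
                / current_setting('block_size')::int)) AS scaled_row_estimate
FROM pg_class AS c
WHERE c.relname IN ('foo', 'bar');
```

After the second batch of inserts, current_pages should be roughly twice relpages, so scaled_row_estimate lands near one million even though reltuples still says 500000.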

The value frequencies in pg_statistic, however, are also outdated and cannot be scaled as trivially as the total row count; you would need to run ANALYZE to refresh them. That explains the bad estimate for the index scan.
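You can see the stale per-column statistics directly in the pg_stats view (a human-readable wrapper over pg_statistic). At this point the most-common-values list for bar.category_id should still contain only the value 1:

```sql
-- Per-column statistics as of the last ANALYZE: category_id = 2 did not
-- exist yet, so 1 appears as the only value, with frequency 1.0.
SELECT most_common_vals, most_common_freqs, n_distinct
FROM pg_stats
WHERE tablename = 'bar'
  AND attname   = 'category_id';
```

With a frequency of 1.0 recorded for the value 1, the planner assumes virtually every row matches category_id = 1 (hence rows=999864 after scaling) and that almost none match category_id = 2 (hence rows=1).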

I don't see how the PK would play any special role in this.
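As a cross-check, refreshing the statistics should fix both estimates, regardless of whether the filtered column belongs to a primary key:

```sql
-- Refresh pg_statistic (and reltuples/relpages) for bar, then re-plan:
ANALYZE bar;

EXPLAIN SELECT * FROM bar
JOIN foo ON bar.foo_id = foo.id
WHERE category_id = 2;
-- The row estimate for category_id = 2 should now be close to 500000.
```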
