When will autovacuum analyze?
14-03-2021
Question
Let's suppose that a table foo has become eligible for autovacuum analyze, e.g. I have inserted a number of rows into foo that exceeds the effective threshold autovacuum_analyze_scale_factor * number of rows + autovacuum_analyze_threshold. I can't figure out which conditions must be satisfied for autovacuum to actually perform the analyze on foo.
Will autovacuum analyze run on table foo only if there are no transactions? Or can it run even during active transactions, provided that (1) no transaction is reading a previous version of table foo and (2) the autovacuum analyze can acquire a SHARE UPDATE EXCLUSIVE lock on table foo?
These questions relate to a particular case. I perform a significant number of INSERTs into foo in one transaction and then some UPDATEs on foo in a second transaction. I need the autovacuum analyze to run before the second transaction so that the query planner has up-to-date statistics for foo and can produce better estimates. Is there a way to guarantee that the autovacuum analyze will run before the second transaction? Maybe by sleeping for a couple of milliseconds between the transactions (I'm running both from a Java app)?
Solution
The formula is correct.
The number of live rows is taken from the reltuples column in pg_class, and the result of the formula is compared to n_mod_since_analyze from pg_stat_all_tables.
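To see how close a table is to that threshold, the comparison can be approximated by hand. The query below is a rough sketch, not what autovacuum literally executes: it assumes the table is named foo, uses the server-wide settings only (it ignores any per-table storage parameters), and treats a negative reltuples (never analyzed) as zero. The column aliases analyze_threshold and due_for_autoanalyze are made up for illustration.

```sql
-- Sketch: compare modifications since the last analyze with the
-- autoanalyze threshold computed from the global settings.
-- Per-table storage parameters are NOT taken into account here.
SELECT s.schemaname,
       s.relname,
       s.n_mod_since_analyze,
       t.analyze_threshold,
       s.n_mod_since_analyze > t.analyze_threshold AS due_for_autoanalyze
FROM pg_stat_all_tables AS s
JOIN pg_class AS c ON c.oid = s.relid
CROSS JOIN LATERAL (
    SELECT current_setting('autovacuum_analyze_threshold')::bigint
           + current_setting('autovacuum_analyze_scale_factor')::float8
             * greatest(c.reltuples, 0) AS analyze_threshold
) AS t
WHERE s.relname = 'foo';
```

Keep in mind that n_mod_since_analyze only reflects committed work that has already been reported to the statistics system, so rows from a still-open transaction will not show up yet.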
Note that autovacuum_analyze_scale_factor and autovacuum_analyze_threshold can be overridden by storage parameters on the table.
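As an illustration of such a per-table override, the statements below set both parameters on foo and then read them back. The values 0.02 and 1000 are arbitrary examples, not recommendations.

```sql
-- Sketch: make autoanalyze trigger earlier for this one table.
-- The numbers are illustrative only.
ALTER TABLE foo SET (
    autovacuum_analyze_scale_factor = 0.02,
    autovacuum_analyze_threshold = 1000
);

-- Per-table overrides are stored in pg_class.reloptions.
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'foo';
```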
If all available autovacuum workers (autovacuum_max_workers) are busy, you may have to wait.
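Analyze takes only a SHARE UPDATE EXCLUSIVE lock on the table. That lock does not conflict with ordinary reads or data modifications, so analyze can run concurrently with almost anything except another ANALYZE or VACUUM, or DDL, on the same table. Like a SELECT statement, it will see the rows that are visible to it.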
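If you need a table to be analyzed between an INSERT and an UPDATE, don't wait for autoanalyze to run. Autovacuum only checks for work once every autovacuum_naptime (one minute by default), so sleeping for a few milliseconds will not help. Explicitly start ANALYZE on the table.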
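A minimal sketch of that sequence follows. The columns id and val and the generated data are hypothetical, since the question does not show the definition of foo. From the Java app, the ANALYZE would simply be sent as one more statement between the two transactions.

```sql
-- First transaction: bulk insert (hypothetical columns and data).
BEGIN;
INSERT INTO foo (id, val)
SELECT g, 'row ' || g
FROM generate_series(1, 100000) AS g;
COMMIT;

-- Refresh the planner statistics explicitly instead of
-- waiting for autoanalyze.
ANALYZE foo;

-- Second transaction: the planner now sees current statistics.
BEGIN;
UPDATE foo
SET val = 'updated'
WHERE id <= 1000;
COMMIT;
```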