Question

I am using an older version of Postgres (8.4.20). I know that the autovacuum process runs frequently to reclaim the space occupied by rows that were deleted or updated. I have a database which does not have frequent deletes or updates.

Does it take less time and memory to process autovacuum on such a database, or does it solely depend on the size and number of objects in the database?

Solution

First things first — 8.4 is no longer supported, so consider upgrading.

The autovacuum settings are documented in the manual.

Let's focus on the settings that affect when autovacuum kicks in. As you might know, this process is responsible for both vacuuming and analyzing tables.

One of the settings affecting ANALYZE frequency is autovacuum_analyze_threshold. As the manual explains, this parameter specifies the minimum number of rows that must change for ANALYZE to be triggered. That serves small tables well, but on big ones and/or tables with high activity it would lead to far too frequent analyzing. To avoid this, another parameter exists, namely autovacuum_analyze_scale_factor: it specifies the fraction of the table to add to the threshold when deciding whether ANALYZE should kick in. The worked example and the query sketch below show how the two combine.

Say we have a table with 10,000 rows, 200 of which have changed:

  1. the 200 changed rows already exceed the default autovacuum_analyze_threshold of 50;
  2. we calculate the fraction based on autovacuum_analyze_scale_factor (which defaults to 0.1), and this gives us 1,000 rows;
  3. the total calculated threshold is therefore 1,050;
  4. as 200 is less than 1,050, ANALYZE is not started (we wait for more changes).
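
Here is a minimal query sketch of the same arithmetic, assuming the default settings (threshold 50, scale factor 0.1). It relies on the row-count estimate PostgreSQL keeps in pg_class.reltuples, and 8.4 does not expose a counter of rows changed since the last ANALYZE, so this only shows the effective threshold, not whether it has been reached:

    -- Effective ANALYZE threshold per ordinary table, assuming the default
    -- autovacuum_analyze_threshold = 50 and autovacuum_analyze_scale_factor = 0.1.
    -- reltuples is the planner's estimate, so treat the numbers as approximate.
    SELECT relname,
           reltuples::bigint    AS estimated_rows,
           50 + 0.1 * reltuples AS effective_analyze_threshold
    FROM pg_class
    WHERE relkind = 'r'         -- ordinary tables only
    ORDER BY reltuples DESC;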

For VACUUM there is an analogous pair of parameters with exactly the same behavior, autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor, except that the default scale factor for vacuuming is 0.2, i.e. 20%.
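
The same check can be sketched for VACUUM. Assuming the defaults (threshold 50, scale factor 0.2), the query below compares each table's dead-tuple count from pg_stat_user_tables against the effective threshold; the counts are statistics-collector estimates, so treat it as a rough indicator only:

    -- Dead tuples vs. the effective autovacuum VACUUM threshold,
    -- assuming the defaults: 50 + 0.2 * estimated row count.
    SELECT s.relname,
           s.n_dead_tup,
           50 + 0.2 * c.reltuples                AS effective_vacuum_threshold,
           s.n_dead_tup > 50 + 0.2 * c.reltuples AS vacuum_due
    FROM pg_stat_user_tables s
    JOIN pg_class c ON c.oid = s.relid
    ORDER BY s.n_dead_tup DESC;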

Now, as you can guess, the bigger your table becomes, the longer it takes to trigger autovacuum on it. Therefore, for bigger tables (typically over 1M rows) it is highly recommended to adjust these settings. You can do this on a per-table basis using the ALTER TABLE ... SET ( storage_parameter = ... ) syntax, as in the sketch below.
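
For example, a per-table override could look like this; big_table is a hypothetical name and the values are purely illustrative, to be tuned for your workload:

    -- Hypothetical table and illustrative values: shrink the scale factors
    -- and rely more on fixed thresholds so autovacuum and autoanalyze
    -- trigger sooner on a large, busy table.
    ALTER TABLE big_table SET (
        autovacuum_vacuum_threshold     = 10000,
        autovacuum_vacuum_scale_factor  = 0.01,
        autovacuum_analyze_threshold    = 10000,
        autovacuum_analyze_scale_factor = 0.005
    );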

It might be tempting to set *_scale_factor to 0 and only increase the threshold for the bigger tables. Still, it is better to keep the factor at a small but non-zero value, because on a highly active table a fixed threshold of, say, 100,000 changed rows might be reached too often, causing unnecessary autovacuuming. See this thread in the pgsql-performance list.
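
After adjusting tables this way, a quick way to review which per-table overrides are in place is to look at pg_class.reloptions (a rough sketch; tables that rely only on the global defaults have a NULL reloptions):

    -- List tables carrying per-table storage-parameter overrides
    -- (the autovacuum settings set above show up here);
    -- NULL reloptions means the global defaults apply.
    SELECT relname, reloptions
    FROM pg_class
    WHERE relkind = 'r'
      AND reloptions IS NOT NULL;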
