How to maintain visibility of all new transactions in append-only PostgreSQL DB without scanning the whole table
23-02-2021
Question
Situation
PostgreSQL v11
I have a database with a dozen tables. No rows are ever DELETEd or UPDATEd. A bulk of data is INSERTed into all the tables in a 'few' (up to 1,000) transactions every day. Some tables can add tens of GBs of data during the INSERT (the largest has almost 2 billion rows as of now).
Problem
I have noticed that at some point the SELECT queries I use to read the data from the DB stop using index-only scans. After some digging, it became apparent that this is due to the visibility map becoming out of date. This is confirmed by running VACUUM, after which the queries revert to using index-only scans. However, VACUUM is very expensive in my case (it can take over 10 hours for the largest table), and AUTOVACUUM is never triggered as there are no DELETE or UPDATE operations.
I have looked at running VACUUM FREEZE after each transaction, but it seems it will need to scan the whole table after each transaction, which again is going to take ages.
Question
What is the best way to mark all the new transactions as visible for append-only PostgreSQL without scanning the whole table every time?
Solution
You should run VACUUM (FREEZE) occasionally. The longer it doesn't run, the more it has to do, and the longer it will take.
To speed up VACUUM, increase maintenance_work_mem.
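As a minimal sketch of the advice above, the two steps could be combined in one session, for example right after each daily bulk load. The table name my_big_table and the 1GB value are placeholders chosen for illustration, not values from the question; pick a memory setting that fits the server.

```sql
-- Raise the memory available to maintenance commands for this session only.
-- '1GB' is an illustrative value, not a recommendation.
SET maintenance_work_mem = '1GB';

-- Freeze tuples and update the visibility map for one table.
-- VERBOSE prints per-table details, which helps judge how much work was done.
-- my_big_table is a hypothetical table name.
VACUUM (FREEZE, VERBOSE) my_big_table;

-- Return to the server default for the rest of the session.
RESET maintenance_work_mem;
```

Running this from a scheduler after every load keeps each run small, since only the pages added since the previous run should need attention.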
Other tips
What is the best way to mark all the new transactions as visible for append-only PostgreSQL without scanning the whole table every time?
PostgreSQL doesn't have to scan the parts of the table which are already marked as all-visible/all-frozen. If there are absolutely no obsolete tuples (and for append-only workloads there should be none, unless some of your INSERTs have rolled back), then it might not have to scan the indexes either. So I don't think the problem you are worried about actually exists.
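One way to see how much of a table VACUUM can skip is to look at how many pages are already marked all-visible. The queries below are a sketch; my_big_table is a hypothetical table name. The pg_class counters are estimates refreshed by VACUUM and ANALYZE, and the second query assumes the pg_visibility contrib extension can be installed on the server.

```sql
-- Estimated share of heap pages marked all-visible for one table.
-- relpages and relallvisible are estimates maintained by VACUUM/ANALYZE.
SELECT relname,
       relpages,
       relallvisible,
       round(100.0 * relallvisible / NULLIF(relpages, 0), 1) AS pct_all_visible
FROM pg_class
WHERE relkind = 'r'
  AND relname = 'my_big_table';

-- Exact counts read from the visibility map itself (requires the
-- pg_visibility contrib extension, usually installed by a superuser).
CREATE EXTENSION IF NOT EXISTS pg_visibility;
SELECT all_visible, all_frozen
FROM pg_visibility_map_summary('my_big_table');
```

If pct_all_visible is close to 100 just before a load, the next VACUUM should mostly have to process the newly appended pages.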
However, VACUUM is very expensive in my case (can take over 10 hours for the largest table)
How long had you let it go before running that VACUUM? How long did the next one after that one take? There is nothing inherently wrong with a VACUUM taking 10 hours to complete; if that is a problem, you should describe what the problem with it is.
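To answer the questions above with numbers, the statistics views can show when a table was last vacuumed and how far a running VACUUM has progressed. This is a sketch only; my_big_table is a hypothetical table name.

```sql
-- When was the table last vacuumed, manually or by autovacuum?
SELECT relname, last_vacuum, last_autovacuum, vacuum_count, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'my_big_table';

-- Progress of any VACUUM currently running in this database
-- (run from a second session while VACUUM is active).
SELECT pid,
       relid::regclass AS table_name,
       phase,
       heap_blks_total,
       heap_blks_scanned,
       round(100.0 * heap_blks_scanned / NULLIF(heap_blks_total, 0), 1) AS pct_scanned
FROM pg_stat_progress_vacuum;
```

Comparing the duration of the first long VACUUM with the one that follows the next load gives a direct measure of how much work repeated runs actually do.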