PostgreSQL Long VACUUM

https://stackoverflow.com/questions/433737

10-07-2019
|

Question

I am currently cleaning up a table with 2 indexes and 250 million active rows and approximately as many dead rows (or more). I issued the command VACCUM FULL ANALYSE from my client computer (laptop) to my server. It has been going about its business for the last 3-4 days or so; I am wondering if it will end anytime soon for I have much work to do!

The server has a quad-code Xeon 2.66 GHz processor, 12 GB or RAM and a RAID controller attached to 2 x 10K rpm 146 GB SAS HDs in a RAID 1 configuration; it is running Suse Linux. I am wondering...

Now, firstly the VACUUM postmaster process seems to be making use of only one core. Secondly, I am not seeing a very high I/O writes to I/O idle time ratio. Thirdly, from calling procinfo, I can extrapolate that the VACUUM process spends most of its time (88%) waiting for I/0.

So why isn't it utilizing more cores through threads in order to overload the RAID controller (get high I/O writes to idle ratio)? Why is it waiting for I/O if the I/O load isn't high? Why is it not going faster with all this power/resources at its fingers? It seems to me that VACUUM can and should be multithreaded, especially if it is working on a huge table and it is the only one working!

Also, is their a way to configure postgresql.conf to let it multithread such VACUUMs? Can I kill it and still benefit from its partial clean-up? I need to work on that table.

[I am using PostgreSQL 8.1]

Thx again

Solution

You don't say what version of PostgreSQL you are using. Is it possible it is pre-8.0?

I had this exact same scenario. Your best best:

kill the vacuum
back up the table with pg_dump -t option
drop the table
restore the table

If you are using 8.x, look at the autovacuum options. Vacuum is single threaded, there's nothing you can do to make it use multiple threads.

OTHER TIPS

Some quick tips:

Run VACUUM FULL VERBOSE so you can se what is going on.
Drop all indexes before the VACUUM. It's faster to rebuild them than vacuum them. You also need to rebuild them now and then because VACUUM FULL isn't good enough (especially on such an old PosgreSQL as 8.1).
Set the maintenance_work_mem really high.
Use a newer PostgreSQL. Btw, 8.4 will have an huge improvement in vacuuming.

An alternative to VACUUM is to dump and restore.

Edit: Since 9.0 VACUUM FULL rewrites the whole table. It's basically the same thing as doing a dump + restore, so running REINDEX is unnecessary.

Are you sure you don't have anything ongoing that could lock tables and prevent vacuum from running ?

(Anyway, it's best to use vacuum_cost_delay so that vacuum is not disruptive to production.)

Old VACUUM FULL is a fossil. It's pretty slow too, and you got to REINDEX afterwards. Don't use it. If you really want to defrag a table, use CLUSTER, or this :

Lettssay you have some disk space left, that's much faster than dump&reload :

CREATE TABLE newtable AS SELECT * FROM oldtable;
CREATE INDEX bla ON newtable( ... );
ALTER TABLE oldtable RENAME TO archive;
ALTER TABLE newtable RENAME TO oldtable;

Note this will not copy your constraints. You can use CREATE TABLE LIKE ... to copy them.

So why isn't it utilizing more cores through threads

pg doesn't support this.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow