Postgres: Reclaiming Space
15-02-2021
Question
I have a server with 150GB of disk space. Today I uploaded a 30GB dataset. I cancelled the import because my internet connection died, then noticed that 29GB of space was missing in the database (meaning the CSV data was uploaded, but not removed when I broke off the operation). When I uploaded the data again, it broke again and I lost another ~25GB. Now there isn't enough free space to upload the data.
This is hosted on AWS RDS, Postgres 10.6.
Is there a way to fix this? I read about VACUUM. But will this delete records? I'm currently hosting ~70GB of data and don't want to lose any records. What's the best way to go about this?
Solution
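PostgreSQL leaves the rows of an aborted load behind as dead data in the table; the space can be reused, but the files won't shrink (significantly). VACUUM only removes such dead rows, so it never deletes live records.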
The official method to reclaim space is VACUUM (FULL), but that rewrites the whole table, which will be unavailable for any access during that time. There are extensions called pg_squeeze and pg_repack which do the same thing with less disruption "behind the scenes".
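As a minimal sketch of the VACUUM (FULL) route, assuming a hypothetical table name my_table (not taken from the question) and a maintenance window in which the table can be locked:

```sql
-- "my_table" is a hypothetical table name; substitute your own.
-- Give up after 5 seconds instead of queueing behind other sessions
-- while waiting for the exclusive lock.
SET lock_timeout = '5s';

-- Rewrites the table into new files without the dead space.
-- Holds an ACCESS EXCLUSIVE lock, so reads and writes on the table
-- block until the statement finishes.
VACUUM (FULL, VERBOSE) my_table;
```

The lock_timeout setting is optional; it only keeps the statement from waiting indefinitely for the lock while blocking everything queued behind it.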
All these methods have in common that they will require enough free space to create a copy of the table, so you probably won't get around increasing the storage space anyway.
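Before choosing one of these methods, it can help to see how large the affected tables are and how many dead rows the statistics collector reports. The query below is a sketch that uses only the standard pg_stat_user_tables view; the row counts are estimates, and free disk space itself has to be checked outside the database (on RDS, for example, through the FreeStorageSpace metric).

```sql
-- Ten largest user tables with their total on-disk size
-- (heap + indexes + TOAST) and estimated live/dead row counts.
SELECT relname,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup,
       n_dead_tup
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 10;
```

A table whose total_size is far larger than its live data would suggest is the likely home of the space lost to the aborted imports.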
Now for the good news:
If you run a plain VACUUM on the table, which is not disruptive, the wasted space can be reused. So your next attempt to load the data won't increase the size of the table.
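A sketch of that step, again assuming the hypothetical table name my_table: run the plain VACUUM, then compare the heap size before and after the next load to confirm that the space is being reused rather than extended.

```sql
-- "my_table" is a hypothetical table name; substitute your own.
-- Plain VACUUM does not block normal reads and writes; it marks the
-- space held by dead rows as reusable and refreshes planner statistics.
VACUUM (VERBOSE, ANALYZE) my_table;

-- Size of the table's main data files. Expect this to stay roughly
-- the same after the VACUUM, and not to grow by the full size of the
-- dataset on the next load.
SELECT pg_size_pretty(pg_relation_size('my_table')) AS heap_size;
```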