Question

I have a server with 150GB of disk space. Today I uploaded a dataset of 30GB. I cancelled the import because my internet connection died, then noticed that 29GB of space had gone missing in the database (meaning the CSV data was uploaded, but not removed when I broke off the operation). When I uploaded the data again, it broke again and I lost another ~25GB. Now there isn't enough free space left to upload the data.

This is hosted on AWS RDS, Postgres 10.6.

Is there a way to fix this? I read about VACUUM, but will that delete records? I'm currently hosting ~70GB of data and don't want to lose any of it. What's the best way to go about this?


Solution

PostgreSQL leaves the dead rows from the aborted loads in the table; that space can be reused, but the data files won't shrink (significantly).
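To see how much of the table is dead rows versus live data, you can query the statistics views. A minimal sketch, assuming the loaded table is called my_dataset (a hypothetical name):

    -- Dead vs. live row counts and the last (auto)vacuum times
    SELECT n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'my_dataset';

    -- Total size on disk, including indexes and TOAST data
    SELECT pg_size_pretty(pg_total_relation_size('my_dataset'));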

The official method to reclaim the space is VACUUM (FULL), but that rewrites the whole table, and the table is unavailable for any access while the rewrite runs. The extensions pg_squeeze and pg_repack do the same thing "behind the scenes" with less disruption.
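As a rough sketch of the syntax, again with the hypothetical table name my_dataset (on RDS you would also need to install the pg_repack extension before its client tool can connect):

    -- Rewrites the table and returns the freed space to the operating
    -- system, but holds an ACCESS EXCLUSIVE lock for the whole rewrite
    VACUUM (FULL, VERBOSE) my_dataset;

    -- Needed once per database before the pg_repack client can be used
    CREATE EXTENSION pg_repack;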

All of these methods have in common that they require enough free space to hold a copy of the table, so you probably won't get around increasing the storage anyway.

Now for the good news:

If you run a plain VACUUM on the table, which is not disruptive, the wasted space becomes reusable. So your next attempt to load the data won't grow the table.
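A minimal sketch of that, with the same hypothetical table name:

    -- Plain VACUUM only takes a SHARE UPDATE EXCLUSIVE lock, so normal
    -- reads and writes keep working while dead rows are marked reusable
    VACUUM (VERBOSE) my_dataset;

    -- The size on disk stays roughly the same, but the next load can
    -- reuse the freed space instead of growing the table
    SELECT pg_size_pretty(pg_total_relation_size('my_dataset'));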
