Question

I'm using PostgreSQL (8.4) to store data produced by an application making frequent inserts (in the table structure described below).

The database keeps growing with time and, since the newer data is more relevant than the older data (in this particular application), deleting the older rows is a reasonable solution (either based on lower id or older input_datetime, which is more or less the same).

To prevent issues related to this database (the only database running on this server) from affecting the rest of the system, I've put the PostgreSQL data directory on its own partition (ext3, on a Linux system). Nevertheless, when this partition becomes full, it causes a number of problems.

I'm thinking of deleting older data regularly (e.g. DELETE FROM data_group WHERE id <= ... via a cron job) to deal with this.
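To make the idea concrete, here is a minimal sketch of the kind of statement the cron job would run. The 30-day retention window is only an assumed value for illustration; the real cutoff would depend on how quickly the partition fills up.

```sql
-- Sketch of a periodic purge (the 30-day window is an assumption, not a requirement).
-- Rows in data_item are removed as well, because data_item.group_id
-- is declared with ON DELETE CASCADE.
DELETE FROM data_group
WHERE input_datetime < now() - interval '30 days';
```

The same statement could filter on id instead of input_datetime, since the two are more or less equivalent here.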

Firstly, my understanding of VACUUM (as performed by autovacuum, which is on) is that, while it doesn't necessarily give the disk space back to the OS (as VACUUM FULL would), it still allows new data to be inserted into the disk space already used. In other words, the DELETEs don't necessarily shrink the files, but they still free space within PostgreSQL's own data structures. Is this correct? (I've noticed that VACUUM FULL caused a few problems with the application itself, probably because of the locks it takes.)

If so, it also appears that SELECT pg_database_size('my_database') reflects the size used on disk, which doesn't necessarily reflect what's available for further inserts. Is there another way to estimate how much space is available for new inserts?
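For reference, these are the kinds of on-disk figures I mean. This is only a sketch using the built-in size functions; my_database is a placeholder name, and none of these numbers shows how much reusable space is left inside the files.

```sql
-- Total size on disk of the whole database.
SELECT pg_size_pretty(pg_database_size('my_database'));

-- Size on disk of each table, including its indexes and TOAST data.
SELECT pg_size_pretty(pg_total_relation_size('data_group')) AS data_group_size,
       pg_size_pretty(pg_total_relation_size('data_item'))  AS data_item_size;
```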

In addition, when it's too late and the partition is 100% full, running this DELETE statement causes the following error and crashes the PostgreSQL service:

PANIC: could not write to file "pg_xlog/xlogtemp.7810": No space left on device

The PostgreSQL daemon stopping is of course a major issue (and there is no other disk to move the cluster to on this machine).

Are there general strategies to prevent this sort of problem from occurring (knowing that disk space is constrained within a given partition, but that it can be acceptable to delete older data)? I would like to automate as much of this as possible, without root or postgres (or PostgreSQL admin) intervention.


CREATE TABLE data_group (
    id SERIAL PRIMARY KEY,
    name TEXT,
    input_datetime TIMESTAMPTZ
);

CREATE TABLE data_item (
    id SERIAL PRIMARY KEY,
    group_id INTEGER NOT NULL REFERENCES data_group(id) ON DELETE CASCADE ON UPDATE CASCADE,
    position INTEGER NOT NULL,
    data BYTEA
);

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange