Question

I have a classic master-slave PostgreSQL (version 10) architecture. Although the parameter wal_keep_segments is set to 200, the pg_wal directory on the standby server is not purged and keeps filling up. Do you have any ideas? For information, I don't have this issue on the primary server.

Both primary and standby have the same configuration.

On the master:

select * from pg_stat_replication ;
-[ RECORD 1 ]----+----------------------------------------------
pid              | 5399
usesysid         | 16387
usename          | replication
application_name | walreceiver
client_addr      | XXXXXXXXXXXX
client_hostname  | XXXXXXXXXXXX
client_port      | 56780
backend_start    | 2018-11-05 10:18:50.280663+00
backend_xmin     |
state            | streaming
sent_lsn         | 71/E3000000
write_lsn        | 71/E3000000
flush_lsn        | 71/E3000000
replay_lsn       | 71/E3000000
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async
-[ RECORD 2 ]----+----------------------------------------------
pid              | 10175
usesysid         | 16389
usename          | barman_replication
application_name | barman_receive_wal
client_addr      | XXXXXXXXXXXX
client_hostname  | XXXXXXXXXXXX
client_port      | 42572
backend_start    | 2018-11-12 03:09:03.715933+00
backend_xmin     |
state            | streaming
sent_lsn         | 71/E3000000
write_lsn        | 71/E3000000
flush_lsn        | 71/E3000000
replay_lsn       |
write_lag        | 00:00:02.516016
flush_lag        | 00:00:02.516016
replay_lag       | 06:28:11.482478
sync_priority    | 0
sync_state       | async

On the standby:

select * from pg_replication_slots ;
-[ RECORD 1 ]-------+------------
slot_name           | barman
plugin              |
slot_type           | physical
datoid              |
database            |
temporary           | f
active              | f
active_pid          |
xmin                |
catalog_xmin        |
restart_lsn         | 16/9F000000
confirmed_flush_lsn |

Solution

The replication slot should exist only on the master, not on the standby, unless you are using cascading replication, which you don't seem to be doing.

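If a replication slot exists on the standby but no client connects to it, nothing reads from it, so its restart_lsn never advances and the standby keeps every WAL segment from that position onward. That explains the retention: your output shows the barman slot as inactive (active = f) with restart_lsn at 16/9F000000, while replication has already reached 71/E3000000. Note that wal_keep_segments only sets a minimum number of segments to keep; it does not cap what a slot holds back.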
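To see how much WAL the slot is holding back, something like the following could be run on the standby. This is only a sketch: it assumes PostgreSQL 10 function names (pg_wal_lsn_diff, pg_last_wal_replay_lsn), and the column alias wal_held_back is just an illustrative name.

```sql
-- Run on the standby.
-- For each slot, show the amount of WAL between the slot's restart_lsn
-- and the position the standby has replayed up to.
SELECT slot_name,
       slot_type,
       active,
       restart_lsn,
       pg_size_pretty(
           pg_wal_lsn_diff(pg_last_wal_replay_lsn(), restart_lsn)
       ) AS wal_held_back   -- illustrative alias, not a built-in column
FROM pg_replication_slots;
```

An inactive slot with a large wal_held_back value is the likely cause of the pg_wal growth.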

See https://www.postgresql.org/docs/9.6/continuous-archiving.html#BACKUP-BASE-BACKUP:

It is often a good idea to also omit from the backup the files within the cluster's pg_replslot/ directory, so that replication slots that exist on the master do not become part of the backup. Otherwise, the subsequent use of the backup to create a standby may result in indefinite retention of WAL files on the standby
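If the slot on the standby was indeed copied over with the base backup and nothing is meant to consume from it, dropping it there should let the standby recycle the old segments. A minimal sketch, assuming the slot name barman from the output above and that no cascading standby or Barman instance is supposed to connect to the standby:

```sql
-- Run on the standby only, NOT on the master:
-- the slot named barman on the master is a separate object and stays untouched.
-- This fails if the slot is currently active; above it shows active = f.
SELECT pg_drop_replication_slot('barman');

-- Check that the slot is gone.
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots;

-- Optional: on a standby, CHECKPOINT requests a restartpoint, which is
-- when WAL segments that are no longer needed get removed or recycled.
CHECKPOINT;
```

The pg_wal directory may not shrink immediately; old segments are cleaned up at the next restartpoint, and wal_keep_segments still keeps its 200 segments.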

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange