Postgres requested WAL segment has already been removed (however it is actually in the slave's directory)

dba.stackexchange https://dba.stackexchange.com/questions/158776

Pergunta

I am using repmgr as my replication tool. On the slave I keep getting an error:

requested WAL segment has already been removed

When I check the Master indeed it is not there; however, it is in the slave's directories both in pg_xlogs and pg_xlogs/archive_status. I can't understand why it would be looking for this file if it's already in the slave?

In fact it has xlogs going past the requested one. The solutions in What to do with WAL files for Postgres Slave reset are for a slightly different problem. They seem to be for a scenario where the master deletes a log file before the slave receives it. In my case it is very much present on the slave and several other files in the sequence after the one being requested.

This also tells me I do not need to increase the keep wal segments option as it didn't seem to fall behind?

Foi útil?

Solução

I simply took a gamble and copied the one file back to the master which was missing. It worked i still don't know what caused the issue.

Outras dicas

From Streaming Replication in the PostgreSQL documentation:

If you use streaming replication without file-based continuous archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be reinitialized from a new base backup. You can avoid this by setting wal_keep_segments to a value large enough to ensure that WAL segments are not recycled too early, or by configuring a replication slot for the standby. If you set up a WAL archive that's accessible from the standby, these solutions are not required, since the standby can always use the archive to catch up provided it retains enough segments.

To fix the issue, you have to reinitialize the data from primary server. Remove data directory on slave:

root@replica:~# su postgres
postgres@replica:~# mv /var/lib/postgresql/12/main /var/lib/postgresql/12/main_old

Copy all data from the primary server:

sudo -u postgres pg_basebackup -h [PRIMARY_IP] -D /var/lib/postgresql/12/main -U replication -P -v

if version is 12, Create the standby.signal file, otherwise configure replica.conf:

touch /var/lib/postgresql/12/main/standby.signal

Slave configuration:

listen_addresses = 'localhost,[IP_ADDRESS_OF_REPLIACA_ON_LAN]'              # what IP address(es) to listen on; 
max_connections = 100 # Ensure that this value is the same as the primary's
wal_level = 'replica'
archive_mode = on
archive_command = 'cd .'
primary_conninfo = 'host=[PRIMARY_IP] port=5432 user=replication password=[REPLICATION PASSWORD]'
hot_standby = on

max_wal_senders = 48

How long is the pg_basebackup taking? Remember that segments are generated about every 5 minutes, so if the backup takes an hour, you need at least 12 segments stored. At 2 hours, you need 24 etc., I'd set the value to about 12.2 segments/hour of backup.

https://www.gab.lc/articles/postgresql-12-replication/

Licenciado em: CC-BY-SA com atribuição
Não afiliado a dba.stackexchange
scroll top