سؤال

I'm trying to figure out various things surrounding PostgreSQL and how backups should work together with WAL and Commvault Simpana. Simpana is telling me that everything is ok, but leaves files lying around in the WAL Archive directory.

Let the journey begin.

Environment

PostgreSQL & OS Version

PostgreSQL 9.3 is running on an Ubuntu 14.04.3 LTS server.

Postgres WAL Config

The postgres.conf file is set as follows for WAL archiving.

#------------------------------------------------------------------------------
# WRITE AHEAD LOG
#------------------------------------------------------------------------------
# - Settings -
#wal_level = minimal                    # minimal, archive, or hot_standby
wal_level = archive

[...]

# - Archiving -
archive_mode = on
#archive_mode = off             # allows archiving to be done
                                # (change requires restart)
archive_command = 'cp %p /pgsql-backup/archive/postgres1/%f'
                                # command to use to archive a logfile segment
                                # archive_command = ''           
                                # command to use to archive a logfile segment
                                # placeholders: %p = path of file to archive
                                #               %f = file name only
                                # e.g. 'test ! -f /mnt/server/archivedir/%f    && cp %p /mnt/server/archivedir/%f'
#archive_timeout = 0            # force a logfile segment switch after this
                                # number of seconds; 0 disables

If the test ... part is left in the archive_command it breaks the Simpana backup, which is why we omitted it.

The above configuration should result in the WAL files being copied from the /pg_xlog/ directory to the /pgsql-backup/archive/postgres1/ directory, when ...

  1. no longer required, because of a pg_basebackup
  2. a WAL file is full (default 16MB) and no longer in use

Commvault Simpana

The client computer has been configured so that the PostgreSQL databases/instance and WAL files in the Archive Log directory are backed up. The WAL files should be deleted when no longer required, because the Simpana option 'Delete Archive' has been set for the PostgreSQL client.

Expected Behaviour

Because Simpana is performing the backup with PostgreSQL native commands, I expect that when Simpana has completed a full backup or a WAL backup, that the files in the /pgsql-backup/archive/postgres1/ directory are deleted.

Effective Behaviour

When I check the /pgsql-backup/archive/postgres1/ directory after Simpana has performed a backup, there will be one more file in the directory with a 0000000300000037000000nn.mmmmmmmm.backup syntax. This is a hint that Simpana is performing a backup using native PostgreSQL commands, because a 0000000300000037000000nn.mmmmmmmm.backup file would only be created after a PostgreSQL backup of the database/instance using pg_basebackup. This is just my conclusion after reading the documentation for PostgreSQL 9.3.

Here is an example of the content of the directory:

[...]
00000003000000370000007A
00000003000000370000007B.00000028.backup
000000030000003700000091.00000028.backup
000000030000003700000093.00000028.backup
000000030000003700000095.00000028.backup
000000030000003700000097.00000028.backup
000000030000003700000099.00000028.backup
00000003000000370000009B.00000028.backup

PostgreSQL Documentation

The official documentation states that

To make use of the backup, you will need to keep all the WAL segment files generated during and after the file system backup. To aid you in doing this, the base backup process creates a backup history file that is immediately stored into the WAL archive area. This file is named after the first WAL segment file that you need for the file system backup. For example, if the starting WAL file is 0000000100001234000055CD the backup history file will be named something like 0000000100001234000055CD.007C9330.backup. (The second part of the file name stands for an exact position within the WAL file, and can ordinarily be ignored.) Once you have safely archived the file system backup and the WAL segment files used during the backup (as specified in the backup history file), all archived WAL segments with names numerically less are no longer needed to recover the file system backup and can be deleted. However, you should consider keeping several backup sets to be absolutely certain that you can recover your data.

This undermines my conclusion that Simpana is using the native PostgreSQL commands to backup the database/instance and its WAL Archive Log files in the directory /pgsql-backup/archive/postgres1/.

According to the documentation a nnnnnnnnnnnnnnnnnnnnnnn.mmmmmmmm.backup file is a pointer to the earliest WAL file required for a roll forward recovery to succeed. Any older WAL files in the Archive Log directory could be deleted and are no longer required.

What amazes me, is that there is a WAL file in the Archive Log directory without a corresponding *.mmmmmmmm.backup pointer file.

Questions

  1. If I weren't using Simpana for the backup, who would (have to) delete the *.mmmmmmmm.backup files in the WAL Archive directory? The pg_archivecleanup command?
  2. Why is there still a full WAL file in the Archive Log directory, when it should have been deleted like all the other WAL files in the Archive Log directory?
  3. Why isn't there a 00000003000000370000007A.mmmmmmmm.backup file for the still existing 00000003000000370000007A WAL file in the archive log directory?

I'm looking forward to your responses and hope that somebody has a similar configuration of Simpana and PostgreSQL somewhere out there.

هل كانت مفيدة؟

المحلول

This seems to be fundamentally a question about Commvault Simpana, not PostgreSQL. As Commvault seems to be commercial software, you might have the best luck contacting their support desk.

Expected Behaviour Because Simpana is performing the backup with PostgreSQL native commands, I expect that when Simpana has completed a full backup or a WAL backup, that the files in the /pgsql-backup/archive/postgres1/ directory are deleted.

I don't know what "a WAL backup" means here. Is that a terminology specific to Simpana? Does it just mean that the WAL files in your original archive directory have been copied to some off-site storage?

Questions If I weren't using Simpana for the backup, who would (have to) delete the *.mmmmmmmm.backup files in the WAL Archive directory? The pg_archivecleanup command?

If you weren't using Simpana, then you would be using something else. We can't tell you what that something else would be--there are lots of choices. While pg_archivecleanup is one such method, it is starting to look pretty obsolete these days. If you only want to keep your WAL files long enough for them to be safely stored (or replayed) on a stand-by, you would now use "streaming replication", and thereby do away with log shipping altogether.

Or you could have a policy to permanently keep the first base backup ever taken (immediately after you initialized your empty database), and every WAL file archived since then, so that you can do point-in-time-recovery to anytime in the history of your database.

Why is there still a full WAL file in the Archive Log directory, when it should have been deleted like all the other WAL files in the Archive Log directory?

It looks to me like when Simpana decides to clean up the archive, instead of removing all WAL files older than the oldest one currently needed, it instead deletes the range of files starting with the one still needed last time it did a clean up, ending at the one just before the one currently needed.

If this is the case, then if a WAL file was archived by PostgreSQL right after you turned on archiving, but before Simpana had been activated (or before it had gotten its bearings) then that file will never be removed.

Why isn't there a 00000003000000370000007A.mmmmmmmm.backup file for the still existing 00000003000000370000007A WAL file in the archive log directory?

If no backup was initiated during the period that 00000003000000370000007A was the active WAL file, then there never would have been a 00000003000000370000007A.mmmmmmmm.backup file in the first place.

نصائح أخرى

We have the same issue with Simpana postgres backup. The documentation states:

What are all the files that get backed up as part of Log Only backups? Only the transaction log files that are relevant to the current backup cycle are backed up. The entire Log directory content will not be backed up. For example, if you run a log backup at time T1 and later a full backup (data and log) at time T2; the transaction log files generated between T1 and T2 will not be part of the latest full backup as they are not relevant for the current backup cycle. Also, .backup files, which are temporary files created by PostgreSQL Server as part of log archiving, are not backed up as part of log backups. How does log file cleanup happen when Archive Delete option is enabled?

How does log file cleanup happen when Archive Delete option is enabled? With the Archive Delete option enabled, only those transaction log files that are backed up as part of log backup are deleted. The transaction log files that were not deleted during the earlier backup attempts are not cleaned up. Also, due to continuous archiving of transaction log files, you may see new transaction log files in the WAL directory after successful log backups. Therefore, the transaction log files that are not part of any log backup as well as the temporary files created by PostgreSQL Server will be left behind in the Log directory.

So: If you dont make an extra log-only backup, just before the next full backup; All these wals never get backupped/deleted; and worse: You can never PITR to this time period in case of disk failure.

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى dba.stackexchange
scroll top