Question

We have been using Ola's Maintenance Solution with database backup direct to URL since we migrated our system to Azure IaaS. We have multi-terabyte DBs (approx 40% FILESTREAM data) and therefore split the backups over multiple files to keep them below 195 GB each. We had no issues until the last couple of weeks, when the full backup started to fail with an I/O device error telling me that a file size was greater than the 195 GB limit, so I increased the @NumberOfFiles parameter. The first week this worked; the next week it failed again, even though the DB had only grown a few GB in size. I ended up having to add 10 to the number of files parameter before the backup completed successfully.

The issue is that most of the files written are approx 90 GB, but 4 are approx 180 GB. Does anyone know the reason for the unequal file sizes and a way to prevent this?

EXECUTE [dbo].[DatabaseBackup]
    @Databases = '<dbname>',
    @URL = '<storage account>',
    @BackupType = 'FULL',
    @Compress = 'Y',
    @Verify = 'Y',
    @CheckSum = 'Y',
    @LogToTable = 'Y',
    @MaxTransferSize = 4194304,
    @BlockSize = 65536,
    @NumberOfFiles = 36;

Thanks in advance


Solution

All Ola's procedure does is call the BACKUP command, i.e., the fact that you are using Ola's procedure is irrelevant. If you want to determine this with 100% certainty, pick up the backup command that was executed (it should be in the CommandLog table, or worst case capture it with a trace) and execute it directly (possibly via an Agent job).
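
For example, a query along these lines should return the most recent full backup command that was generated (assuming the standard dbo.CommandLog table created by the Maintenance Solution install, which @LogToTable = 'Y' writes to; adjust the database/schema if you installed it elsewhere):

-- Pull the most recent full backup command logged for the database,
-- so it can be re-run directly or from an Agent job.
SELECT TOP (1)
    cl.ID,
    cl.DatabaseName,
    cl.Command,        -- the exact BACKUP DATABASE ... TO URL statement
    cl.StartTime,
    cl.EndTime,
    cl.ErrorNumber,
    cl.ErrorMessage
FROM dbo.CommandLog AS cl
WHERE cl.CommandType = 'BACKUP_DATABASE'
    AND cl.DatabaseName = '<dbname>'
ORDER BY cl.StartTime DESC;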

My guess is that you are simply unlucky, combined with using backup compression. I.e., some backup thread just happens to run into data that doesn't compress well, and because of that the size of that thread's file becomes larger than the other threads' files.

You can verify this by turning off compression and seeing whether you get much more even file sizes. I understand that this might not be practical because of the database size, but perhaps you have a smaller database with the same symptom for which you can do this test? I've asked around, and we'll see if I can get confirmation of my theory.
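
Something like this against a smaller test database would do as the test; this is just a sketch of your call with compression off, and the database name and file count are placeholders:

EXECUTE [dbo].[DatabaseBackup]
    @Databases = '<smaller test dbname>',   -- placeholder: a smaller database showing the same symptom
    @URL = '<storage account>',
    @BackupType = 'FULL',
    @Compress = 'N',                        -- compression off for this test
    @Verify = 'Y',
    @CheckSum = 'Y',
    @LogToTable = 'Y',
    @MaxTransferSize = 4194304,
    @BlockSize = 65536,
    @NumberOfFiles = 4;                     -- placeholder: whatever striping you would normally use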

I tried this myself and did indeed see a bigger file size variation with compression.

FWIW, in 7.0 (when striping was introduced) each backup thread pushed as much data into its destination (file or tape) as the destination could consume. This gave the best backup throughput, but uneven file sizes if one tape device (or file destination) happened to be much slower than the others. The algorithm changed in 2000 to spread the data evenly across the backup files. My assumption is that this determination of how the data is to be spread is done before compression.

SQL Server pre-allocates the backup files for performance reasons. When you use compression, it has to guess what size you will end up with, and you can end up in a situation where it tries to pre-allocate more than it would eventually need, causing the backup to fail. Trace flag 3042 changes this: it causes SQL Server to allocate storage on the fly. I very much doubt this will change anything for you, but I thought I'd mention it so you are aware of what the trace flag does and don't get your hopes up too much in case you read about the trace flag and decide to test it.
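
If you do decide to test it, the trace flag can be enabled globally before the backup and turned off again afterwards; a minimal sketch:

-- Trace flag 3042: compressed backup files are grown on demand instead of
-- being pre-allocated based on an estimated final size.
DBCC TRACEON (3042, -1);

-- Re-run the backup (via Ola's procedure or the native BACKUP command)
-- and compare the resulting file sizes.

-- Disable it again if you don't want the behaviour to persist until the
-- instance is restarted or the flag is explicitly turned off.
DBCC TRACEOFF (3042, -1);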

Licensed under: CC-BY-SA with attribution