Question

I know this has been discussed and answered on various forums, and I completely understand that the answer is no in most cases.

However, I want to know whether it can be a good approach under any scenario:

Let's take our current prod DB, which in our case is over 30 TB with a data retention of 1 year. Tables are mostly compressed and partitioned.

The development team came up with logic where they don't need data kept for a full year and want to change retention to 6 months. So the data is expected to be reduced to roughly half after the purge, with only 6 months of data kept going forward.

This is where we have been asked to work on shrinking the data files so that the LUNs with terabytes of space could be returned, or, if we migrate to a new SQL Server version, we would request roughly 10 TB less, which does save some money, assuming we keep 5 TB extra for future growth of the database.

In this case, would a shrink help if we do it in small chunks, assuming blocking may be acceptable during off-peak hours and that a long-running shrink can be canceled?
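For reference, this is roughly what "shrinking in small chunks" could look like for one file; the file name and target sizes below are made up, and each DBCC SHRINKFILE call is its own step, so canceling only loses the current chunk:

    -- Check current size and used space per file (values in MB)
    USE YourVendorDb;   -- placeholder database name
    SELECT name,
           size / 128                            AS size_mb,
           FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
    FROM sys.database_files;

    -- Shrink one data file in small steps; target size is in MB
    DBCC SHRINKFILE (N'Data_File_01', 972800);   -- 950 GB target
    DBCC SHRINKFILE (N'Data_File_01', 921600);   -- 900 GB, next chunk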

How worthwhile is the data file shrink activity described above, or is there a better approach?

Edit: We are migrating to a recent version, possibly SQL Server 2017 or SQL Server 2019. The tables are not partitioned across filegroups; all 30+ files of this database are on PRIMARY. I know, it's just a vendor DB design and there's not much in my hands.


Solution

That is a pretty hefty database to shrink. I'd just like to point out two things you want to check for before you start your shrink operations (a quick check for both is sketched after these two points):

LOB data. This will take forever, since there are no back-pointers for LOB data. I.e., when a LOB page is moved, all SQL Server knows is which table it belongs to. It has to do a table scan to find the owning row, one scan for each LOB value on that page.

Heap tables. For each page moved, SQL Server has to modify every nonclustered index for every row on that page. It can easily take 5-10 times longer to shrink a heap table compared to a table with a clustered index.
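If it helps, here is a rough way to find both ahead of time; the queries below are a sketch against the standard catalog views, run in the database you plan to shrink:

    -- Heap tables (no clustered index)
    SELECT s.name AS schema_name, t.name AS table_name
    FROM sys.tables t
    JOIN sys.schemas s ON s.schema_id = t.schema_id
    JOIN sys.indexes i ON i.object_id = t.object_id AND i.index_id = 0;

    -- Tables containing LOB columns (max types, xml, legacy text/image)
    SELECT DISTINCT s.name AS schema_name, t.name AS table_name
    FROM sys.tables t
    JOIN sys.schemas s ON s.schema_id = t.schema_id
    JOIN sys.columns c ON c.object_id = t.object_id
    WHERE c.max_length = -1
       OR TYPE_NAME(c.system_type_id) IN ('text', 'ntext', 'image', 'xml');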

One option you might want to consider at that size is to move data to another filegroup before the shrink (or possibly just remove that filegroup if it is empty now).
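A minimal sketch of that idea, with hypothetical database, file, table, and index names: rebuilding a table's clustered index onto a new filegroup moves its data there, after which an emptied secondary file can be removed and its LUN returned.

    -- Add a new filegroup and file (names/paths are placeholders)
    ALTER DATABASE YourVendorDb ADD FILEGROUP FG_MOVE;
    ALTER DATABASE YourVendorDb
        ADD FILE (NAME = N'MoveData01',
                  FILENAME = N'E:\Data\MoveData01.ndf',
                  SIZE = 100GB)
        TO FILEGROUP FG_MOVE;

    -- Rebuild the clustered index on the new filegroup to move the table's data
    CREATE UNIQUE CLUSTERED INDEX PK_BigTable
        ON dbo.BigTable (Id)
        WITH (DROP_EXISTING = ON)
        ON FG_MOVE;

    -- Push remaining data off a no-longer-needed secondary file, then drop it
    DBCC SHRINKFILE (N'OldDataFile07', EMPTYFILE);
    ALTER DATABASE YourVendorDb REMOVE FILE OldDataFile07;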

OTHER TIPS

You can read about what's bad about shrinking databases here - https://www.brentozar.com/archive/2017/12/whats-bad-shrinking-databases-dbcc-shrinkdatabase/

and about what you can do when shrinking really is a must here - https://www.brentozar.com/archive/2020/07/what-if-you-really-do-need-to-shrink-a-database/

When we shrink a database, it introduces both external and internal fragmentation, it causes blocking, and it causes transaction log growth while it runs. Index fragmentation goes way up, and when we rebuild the indexes, the database size goes back up, causing the endless shrink-rebuild cycle.
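If you do go down that road, it is worth measuring the fragmentation rather than guessing; a rough check after a shrink (database name is a placeholder) could look like:

    USE YourVendorDb;   -- placeholder
    SELECT OBJECT_NAME(ips.object_id)        AS table_name,
           i.name                            AS index_name,
           ips.avg_fragmentation_in_percent,
           ips.page_count
    FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
    JOIN sys.indexes i
      ON i.object_id = ips.object_id AND i.index_id = ips.index_id
    WHERE ips.page_count > 1000
    ORDER BY ips.avg_fragmentation_in_percent DESC;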

Assuming you have the downtime in place, if I had to do a shrink to move the DB, I would script out all indexes and drop them, run CHECKPOINT multiple times, take a log backup, shrink the log file, then shrink the database, take a full backup, and restore this backup to the new location. Then run the index script to create the indexes again and let the database grow on the new server. Caution: creating indexes on a 30 TB database is going to be a pain.
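A rough outline of those steps in T-SQL; every name, path, and size below is a placeholder, and the actual index drop/create scripts would come from your own scripting step:

    USE YourVendorDb;
    GO

    -- 1. Script out the indexes, save the scripts, then drop them
    -- DROP INDEX IX_BigTable_SomeColumn ON dbo.BigTable;

    -- 2. Flush dirty pages and keep the log small
    CHECKPOINT;
    GO 3
    BACKUP LOG YourVendorDb TO DISK = N'X:\Backup\YourVendorDb_log.trn';

    -- 3. Shrink the log file, then the data files (targets in MB)
    DBCC SHRINKFILE (N'YourVendorDb_log', 10240);
    DBCC SHRINKFILE (N'Data_File_01', 512000);   -- repeat per data file

    -- 4. Full backup, restore on the new server
    BACKUP DATABASE YourVendorDb
        TO DISK = N'X:\Backup\YourVendorDb_full.bak' WITH COMPRESSION;
    -- On the new instance:
    -- RESTORE DATABASE YourVendorDb FROM DISK = N'...' WITH MOVE ..., RECOVERY;

    -- 5. Re-run the saved index creation scripts and let the files grow there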

There could be a better way of doing this; I would be happy to keep an eye on other answers.

One possibility would be to use transactional replication and add static row filters to filter out the older data. This would also have the advantage of a shorter cutover time to the new server (I'm assuming you're doing a side-by-side migration, not an in-place upgrade).
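A hedged sketch of what the static row filter part might look like when adding an article; the publication, table, column, and cutoff date are all hypothetical:

    -- Only rows newer than the cutoff get replicated to the new server
    EXEC sp_addarticle
        @publication   = N'Pub_MigrateTo2019',
        @article       = N'BigTable',
        @source_owner  = N'dbo',
        @source_object = N'BigTable',
        @filter_clause = N'CreatedDate >= ''20250101''';   -- 6-month cutoff, placeholder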

Licensed under: CC-BY-SA with attribution