Question

On a production server (Microsoft SQL Server 2014), the file MSDBData.mdf ballooned to more than 450 GB, consuming almost all the disk space on the server (only 38 GB remaining).

First, I created a backup of msdb on a network share (via SSMS -> Tasks -> Back Up). Then I tried to reduce the size of the file (via SSMS -> Tasks -> Shrink -> Files), but the shrink has stalled and no disk space has been released. The shrink dialog window has been frozen for hours.

In the SSMS Activity Monitor, under Processes, the shrink session shows a status of SUSPENDED.

Event Viewer shows several events (source MSSQLSERVER, event ID 847) beginning with:

 1. Time-out occurred while waiting for latch: class 'FGCB_ADD_REMOVE'
 2. Time-out occurred while waiting for latch: class 'FCB'

The question: is it safe to close the dialog window of SSMS to stop the shrinking of MSDB, in order to carry out further investigation?

Solution

Yes, it is safe to cancel that operation (but you need to be patient and let it finish rolling back - do not panic and stop the SQL Server service or reboot the server; all that will do is make the rollback start over).
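
While the cancel rolls back, you can watch it from a second query window. The following is a minimal sketch, assuming the shrink is the only other request running against msdb. The percent_complete column is only populated for certain commands, so a value of 0 does not by itself mean that nothing is happening.

```sql
-- Run in a separate query window while the cancelled shrink rolls back.
SELECT r.session_id,
       r.command,                 -- a file shrink typically shows as DbccFilesCompact
       r.status,                  -- e.g. suspended, running, rollback
       r.wait_type,               -- latch waits would match the event ID 847 messages
       r.blocking_session_id,
       r.percent_complete,
       r.total_elapsed_time / 1000 AS elapsed_seconds
FROM sys.dm_exec_requests AS r
WHERE r.database_id = DB_ID(N'msdb')
  AND r.session_id <> @@SPID;     -- exclude this monitoring query itself
```

If the row disappears, the request has finished (or finished rolling back) and the file is no longer being touched by that session.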

While you're waiting, let's address your underlying problem.

Outside of a scenario where you manually increase the size to prepare for more data (or just get dibs on space before someone else takes it), data files increase in size because you added data to the file, and it wasn't already big enough to store that data. Telling SQL Server to shrink a file that had to grow to accommodate more data is unlikely to accomplish anything unless you remove some of the data that caused it to grow in the first place.

This could be from a rogue backup job - maybe 3rd party - that is running every second and filling the backup history tables. Or a job run amok that is generating job history and filling that table with errors. Or an infinite loop that is filling database mail tables with junk. Or dozens of other things, especially if you've put user tables for your own processes into msdb.

To find out what is taking up the space, you can run one of the many published queries that report space used per table and per index. It could be that the tables are massive due to fragmentation, or lots of LOB data, or a really bad fill factor, or oversized but underpopulated fixed-width columns, or just the sheer number of rows. But you're going to have to find this and clear it up before shrinking has any hope of accomplishing anything.
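
One possible way to see which objects hold the space is sketched below. It is not necessarily the query the answer had in mind. It reads sys.dm_db_partition_stats and joins to sys.objects rather than sys.tables, so that internal tables (such as Service Broker queues) also show up. The TOP (20) cutoff is an arbitrary choice.

```sql
USE msdb;
GO
-- Largest objects in msdb by reserved space.
-- Page counts are in 8 KB pages, so * 8 / 1024 converts them to MB.
SELECT TOP (20)
       s.name      AS schema_name,
       o.name      AS object_name,
       o.type_desc,
       SUM(CASE WHEN ps.index_id IN (0, 1) THEN ps.row_count ELSE 0 END) AS row_count,
       SUM(ps.reserved_page_count) * 8 / 1024 AS reserved_mb,
       SUM(ps.used_page_count)     * 8 / 1024 AS used_mb,
       SUM(ps.lob_used_page_count) * 8 / 1024 AS lob_used_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.objects AS o ON o.object_id = ps.object_id
JOIN sys.schemas AS s ON s.schema_id = o.schema_id
GROUP BY s.name, o.name, o.type_desc
ORDER BY SUM(ps.reserved_page_count) DESC;
```

A large gap between reserved_mb and used_mb points toward wasted space inside the object. A very high row_count points toward runaway history.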

Remove the data you no longer need, and take whatever steps are necessary to ensure that the process that generated it does not do so again. Rebuild the indexes with a proper fill factor, and set up some kind of maintenance routine (like Ola Hallengren's scripts) to manage indexes. Consider adding an alert on too many file growth events for the msdb database (or all databases), so you can catch this problem earlier. You can see the file growth events that happened recently using, for example, the default trace.
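
As a sketch of two of these steps: the first batch purges old history using the cleanup procedures that ship in msdb, and the second lists recent auto-grow events from the default trace. The 90-day retention is an assumption chosen only for illustration. Run a purge only for the tables that actually turned out to be large. If user tables are the culprit, none of these procedures apply.

```sql
USE msdb;
GO
-- Part 1: purge old history (destructive; adjust or skip each call as needed).
-- The 90-day cutoff is an illustrative assumption, not a recommendation.
-- On a very large history table, consider stepping the cutoff forward in
-- smaller increments so that each delete stays manageable.
DECLARE @cutoff datetime = DATEADD(DAY, -90, GETDATE());

EXEC dbo.sp_delete_backuphistory     @oldest_date   = @cutoff;  -- backup/restore history
EXEC dbo.sp_purge_jobhistory         @oldest_date   = @cutoff;  -- SQL Server Agent job history
EXEC dbo.sysmail_delete_mailitems_sp @sent_before   = @cutoff;  -- Database Mail items
EXEC dbo.sysmail_delete_log_sp       @logged_before = @cutoff;  -- Database Mail event log
GO

-- Part 2: recent auto-grow events for msdb from the default trace.
-- This reads the current trace file only; older rollover files are not included.
DECLARE @path nvarchar(260);
SELECT @path = path FROM sys.traces WHERE is_default = 1;

SELECT te.name                   AS event_name,
       t.DatabaseName,
       t.FileName,
       t.StartTime,
       t.Duration / 1000         AS duration_ms,  -- Duration is reported in microseconds
       t.IntegerData * 8 / 1024  AS growth_mb     -- IntegerData is the growth in 8 KB pages
FROM sys.fn_trace_gettable(@path, DEFAULT) AS t
JOIN sys.trace_events AS te ON te.trace_event_id = t.EventClass
WHERE t.EventClass IN (92, 93)   -- 92 = Data File Auto Grow, 93 = Log File Auto Grow
  AND t.DatabaseName = N'msdb'
ORDER BY t.StartTime DESC;
```

If the default trace has been disabled on the instance, @path will be NULL and the second query returns nothing.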

Then, use DBCC SHRINKFILE (not some UI hinkiness in SSMS, or DBCC SHRINKDATABASE, which will also try to shrink the log, and will try to shrink everything at once), a little bit at a time, to bring the file size back down. If the file is currently 450 GB, and you free up 300 GB (you can use sp_spaceused to see the result of the above), I would get to a 200 GB file, let's say (leaving room for future growth), by first shrinking to 425 GB, then 400 GB, then 375 GB, and so on. You may need to do that in smaller chunks, say starting with 2 GB or 5 GB at a time, depending on the capabilities of the underlying disk subsystem.
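
The stepwise shrink can be scripted rather than typed one statement at a time. The following is a rough sketch under stated assumptions: the logical file name MSDBData is the default for msdb and should be confirmed with the first query, and the 200 GB target and 5 GB step are taken from the example figures above rather than being recommendations.

```sql
USE msdb;
GO
-- Confirm the logical file name and how much of the file is actually in use.
-- size and SpaceUsed are in 8 KB pages, so dividing by 128 gives MB.
SELECT name,
       size / 128                            AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM sys.database_files;
GO

DECLARE @file       sysname = N'MSDBData';  -- assumed default logical name; verify above
DECLARE @target_mb  int     = 200 * 1024;   -- final size, from the example figures
DECLARE @step_mb    int     = 5 * 1024;     -- chunk size; tune to the disk subsystem
DECLARE @current_mb int, @next_mb int, @after_mb int;

SELECT @current_mb = size / 128
FROM sys.database_files
WHERE name = @file;

WHILE @current_mb > @target_mb
BEGIN
    -- Next target: one step smaller, but never below the final target.
    SET @next_mb = CASE WHEN @current_mb - @step_mb < @target_mb
                        THEN @target_mb
                        ELSE @current_mb - @step_mb
                   END;

    DBCC SHRINKFILE (@file, @next_mb);      -- target size is given in MB

    SELECT @after_mb = size / 128
    FROM sys.database_files
    WHERE name = @file;

    IF @after_mb >= @current_mb
        BREAK;                              -- no progress was made; stop and investigate

    SET @current_mb = @after_mb;
END;
```

Each iteration is a separate, short shrink. If one has to be cancelled, only that small step rolls back, not hours of work.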

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange