Unexplained sustained space used in the TempDB log file
-
12-03-2021 - |
Domanda
A few days ago the amount of used space in both the tempdb
data and log files started growing rapidly. After 30 minutes, the tempdb
log remained around 99% space used for about 7 hours. Towards the end of the 7 hour period, I:\Databases\templog.ldf: Operating system error 112(There is not enough space on the disk.) encountered.
was logged followed by about 30 instances of The transaction log for database 'tempdb' is full due to 'ACTIVE_TRANSACTION'
over a 3 second period, and then suddenly the tempdb
log file returned to 1% used space. This usage data is recorded in logs every 3 minutes via a SQL agent job.
I also have a SQL agent job that runs sp_WhoIsActive
every 1 minute and captures the output. I don't see anything that stands out in this data that explains what was consuming or holding space in the tempdb
. There are some maintenance jobs running for several minutes at a time, typically running no longer than 5 or 10 minutes each. Even if one of those jobs or another session used up tempdb
log space, I would expect the space to be reclaimable after any transactions complete or at least after the session ends.
- It's a 200 GB drive dedicated for
tempdb
data and log files. - Both the data and log files grew during this time until there was no more space on disk.
- There's a
tempdb_2.ndf
file I don't recall seeing before (maybe it's always been there...).tempdb.mdf
andtempdb_2.ndf
are the same exact size, around 29 GBs each.templog.ldf
is using 137 GBs. - Running Microsoft SQL Server 2017 (RTM-CU15-GDR) (KB4505225) - 14.0.3192.2 (X64)
I'm trying to understand what happened and how to prevent it again.
- Is it possible for
tempdb
log space to not be reclaimable for hours even if there are no sessions running for anywhere near that long? - What caused
tempdb
log space used to suddenly free up 3 seconds after the above error messages (not enough space on disk andtempdb
transaction log is full)? - Is
sp_WhoIsActive
logging not enough to capture what process was holding onto thetempdb
log space, what other additional logging can I do? - If the
tempdb
log space used was actually reclaimable but a glitch in SQL Server prevented it from clearing out, is there a T-SQL command I can run to try and free up available space in the log?
Any thoughts or suggestions are very appreciated.
Soluzione
Is it possible for
tempdb
log space to not be reclaimable for hours even if there are no sessions running for anywhere near that long?
While it's probably possible, I would say under normal circumstances the answer is no. In the absence of long-running transactions or long-lived sessions, the tempdb
log should clear pretty quickly.
What caused
tempdb
log space used to suddenly free up 3 seconds after the above error messages (not enough space on disk andtempdb
transaction log is full)?
The most likely explanation is that there was a long-running transaction. It finally encountered this error and was killed, which allowed the log space to free up.
Is
sp_WhoIsActive
logging not enough to capture what process was holding onto thetempdb
log space, what other additional logging can I do?
I would expect that to capture it. One thing that might have caused you to miss it is if the process had executed statements in a transaction and then just stopped doing anything (without committing, rolling back, or disconnecting).
This would be the case if someone ran a query in SSMS that used a ton of tempdb
, and then just left that SSMS window open all day. These queries (that are holding resources, but not doing anything actively) won't show up in sp_WhoIsActive
by default. You need to add the @show_sleeping_spids = 2
option to see those. Here's an excerpt from the built-in docs:
If the
tempdb
log space used was actually reclaimable but a glitch in SQL Server prevented it from clearing out, is there a T-SQL command I can run to try and free up available space in the log?
Not that I can think of. You might try issuing a CHECKPOINT
, to force the VLFs to clear:
USE [tempdb];
GO
CHECKPOINT;