Question

So, I'm currently trying to wrap my head around worker threads for availability groups, more specifically, the number of threads used by each secondary database. The information I've read is different to what I'm seeing. For example:

I'm using the sys.dm_hadr_db_threads and sys.dm_hadr_ag_threads DMVs.

"Each Secondary replica uses 1 redo thread for each secondary database" - but my secondary replica databases are using either 3 or 4 threads:

[Screenshot: secondary replica threads]

It's a very inactive AG, so I wouldn't expect to see any active threads (again, from what I've read, the threads are released for re-use after 15 seconds or so). I don't see any active threads on the primary, which is what I would expect:

[Screenshot: primary replica threads]

It's not causing an issue (yet); it's just something I'm curious about. Have I misunderstood how it works? I just don't want it to over-consume threads, as eventually quite a large number of databases will be added to the AG, and I'm trying to figure out whether we'll have enough worker threads for both the AG and daily database usage.

Config:

SQL Server 2019, 1 AG, 1 primary, 1 secondary


Solution

I believe what you read is dated information that applies specifically to the first version of Availability Groups, released with SQL Server 2012. I found what I'm guessing is the RedGate article, written in 2014, that contains the quote "Each Secondary replica uses 1 redo thread for each secondary database". This was known as the serial redo model.

The source article from Microsoft which that RedGate article references has since been updated (in 2020) and no longer states that's how the secondary replica works.

Furthermore, I found the following Microsoft article, which mentions that they switched to multiple parallel redo worker threads per database by default (the parallel redo model) starting with SQL Server 2016. This must be what you're experiencing.

When availability group was initially released with SQL Server 2012, the transaction log redo was handled by a single redo thread for each database in an AG secondary replica. This redo model is also called as serial redo. In SQL Server 2016, the redo model was enhanced with multiple parallel redo worker threads per database to share the redo workload.
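If you want to compare the AG thread counts against the instance-wide worker thread budget, something like the following may help. It is only a minimal sketch: it assumes you run it on each replica with VIEW SERVER STATE permission, and it uses SELECT * on the two HADR DMVs you mentioned rather than assuming which columns your build exposes.

```sql
-- Thread usage per availability group on this replica
SELECT * FROM sys.dm_hadr_ag_threads;

-- Thread usage per availability database on this replica
SELECT * FROM sys.dm_hadr_db_threads;

-- Configured ceiling on worker threads for the instance
SELECT max_workers_count
FROM sys.dm_os_sys_info;

-- Workers currently allocated and currently active on the user schedulers
SELECT SUM(current_workers_count) AS current_workers,
       SUM(active_workers_count)  AS active_workers
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';
```

Running this before and after adding a batch of databases to the AG should give a rough idea of how much headroom is left against max_workers_count.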

This is a feature that is probably not often tweaked, so I would personally recommend leaving it as is until you run into a problem you're sure is related to the parallel redo model. If you do run into an issue, then it is possible to switch back to the original serial redo model as the above article states:

To switch to serial redo model, TF 3459 needs to be enabled. But after the SQL Server instance is running in serial redo model, the only way to change it back to parallel redo is to restart the SQL Server service.
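For reference, a trace flag is typically checked and enabled globally as sketched below. This assumes, as the quoted article describes, that enabling TF 3459 is what switches the instance to serial redo; try it on a test instance first, since going back to parallel redo requires a service restart.

```sql
-- Check whether trace flag 3459 is currently enabled globally
DBCC TRACESTATUS (3459, -1);

-- Enable trace flag 3459 globally (switches to the serial redo model per the article)
DBCC TRACEON (3459, -1);
```

A flag enabled with DBCC TRACEON does not survive a restart; to keep serial redo permanently you would add -T3459 as a startup parameter instead.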

Nice question by the way. :)

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange