Question

I'm using NServiceBus (v 4.0.5) on an Azure virtual machine with the Azure Service Bus transport (v 4.0.5). The NServiceBus.Host service used to crash occasionally, but lately it has been crashing more often than not. The exception thrown is:

Application: NServiceBus.Host.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: Microsoft.ServiceBus.Common.CallbackException
Stack:
at Microsoft.ServiceBus.Common.Fx+IOCompletionThunk.UnhandledExceptionFrame(UInt32, UInt32, System.Threading.NativeOverlapped*)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32, UInt32, System.Threading.NativeOverlapped*)

I'm using a dedicated machine running the generic host service, and I have 3 machines that send messages to it (I don't use pub/sub).

What I've tried

  • Rebooting / restarting the service manually.
  • Researching the error: few people seem to have hit this exception, and the answers they received did not apply to my situation.
  • Verifying the dead letter queue: several messages are placed in the dead letter queue (over 400 in the past 6 months), but I could not correlate any specific message types to the crash (at least 40% of my message types have been found in the dead letter queue). I'm assuming that most of these messages have been added to the DLQ because the service is failing.
  • Checking application logs: my application logs exceptions to a log4net log, however no exceptions were logged during the time of the crashes.
  • Checking event logs: nothing relevant was found except for the main error message noted above.
  • Upgrading NServiceBus to 4.4.2 and the WindowsAzureServiceBus package to 5.1.1: due to NuGet package conflicts, upgrading is proving to be painful. I'm using Microsoft.Data.OData 5.4.0 and Microsoft.Data.Edm 5.4.0, but the NServiceBus.Azure package depends on v5.2.0 of these assemblies. I could discard the NuGet package dependencies and add the references myself, but I'd like to know why the WindowsAzureServiceBus package depends specifically on v5.2.0 before doing this.

Any thoughts or ideas would be helpful.

Thank you!

Solution

I will look into this. It sounds like a bug, most likely an unhandled exception surfacing from the Azure Service Bus client (though it doesn't necessarily originate there).

I've created a GitHub issue here: https://github.com/Particular/NServiceBus.Azure/issues/133

Are you able to reproduce the issue? And what changed between the time when you saw it occasionally and now, when it happens often?

One thing you could do is add an event handler for all unhandled exceptions occurring in the AppDomain and log those as well. That should theoretically catch anything, and if this CallbackException has an InnerException, you would capture it this way.
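
A minimal sketch of such a handler, assuming log4net (which the question says is already in use). The class name `CrashLogging` and the method `Register` are hypothetical names chosen for illustration; only the `AppDomain`, `TaskScheduler`, and log4net calls are existing APIs.

```csharp
using System;
using System.Threading.Tasks;
using log4net;

// Hypothetical helper: the class and method names are illustrative only.
public static class CrashLogging
{
    private static readonly ILog Log = LogManager.GetLogger(typeof(CrashLogging));

    // Call once, as early as possible during endpoint startup.
    public static void Register()
    {
        // Raised for exceptions that no handler caught. The process
        // normally still terminates after this event fires.
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            // ExceptionObject is typed as object; it is normally an Exception.
            var ex = e.ExceptionObject as Exception;
            if (ex != null)
            {
                // Passing the exception lets log4net render it, which
                // normally includes any InnerException and its stack trace.
                Log.Fatal("Unhandled exception (IsTerminating=" + e.IsTerminating + ")", ex);
            }
            else
            {
                Log.Fatal("Unhandled non-Exception object: " + e.ExceptionObject);
            }

            if (e.IsTerminating)
            {
                // Flush buffered appenders before the process goes away.
                LogManager.Shutdown();
            }
        };

        // Raised when a faulted Task is garbage collected without anyone
        // having observed its exception.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            Log.Error("Unobserved task exception", e.Exception);
        };
    }
}
```

Calling `CrashLogging.Register()` from the endpoint's initialization code, before the bus starts, should be enough. The handler only records the failure; it does not prevent the host from crashing.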

On the strict package dependencies: this is mostly done because the NuGet package manager does not apply binding redirects to the app.config of worker roles, which tripped up way too many users in the past (it often manifests itself as an infinitely rebooting worker role). So go ahead and override them.
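
If you do override the dependencies, binding redirects along these lines may be needed in the host's config file. This is a sketch under assumptions: the version numbers are taken from the 5.4.0 packages mentioned in the question, and the public key token is assumed to be the one these Microsoft assemblies are normally signed with. Verify both against the assemblies actually referenced.

```xml
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <!-- Assumption: redirect every older request to the 5.4.0.0 assembly in use. -->
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.OData" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.4.0.0" newVersion="5.4.0.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.Edm" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.4.0.0" newVersion="5.4.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
```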

Licensed under: CC-BY-SA with attribution