Question

I have an application that sends emails at scheduled times. Sometimes the application gets stuck while sending an email, and I'm still not sure why.

I thought about implementing a simple watchdog like this: before the application starts sending an email, it creates a new watchdog instance, which starts a one-time timer. If the task completes successfully, we tell the watchdog to stand down and it cancels its timer. If the timer's period elapses first, we forcefully exit the program.

I'm not sure if this is a valid solution or more like a hack, and would appreciate any input regarding this subject.

Thanks, Omer


Solution

It isn't such a bad idea IMHO.

The main pitfall I see here isn't technical, but a human one: once the watchdog is operational and does its thing properly, and the customers don't complain anymore, it is way too easy to just say "problem solved!" and forget about the original issue (tossing it to the backlog at best, marking it as "resolved" at worst).

On the technical side:

You may want to consider the isolation level between the watchdog and the application, and how violent the watchdog's action is. The minimal isolation is having the watchdog run on a different thread (the one-time timer will do that). It might be better to have the email mechanism and the watchdog run in different AppDomains, so the watchdog unloads the entire "email AppDomain" on timeout. This gives you a result similar to killing the process (at least from a "managed" point of view), but is less violent than killing the process and starting it again.
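To make the minimal-isolation variant concrete, here is a rough sketch of a one-shot watchdog built on a timer thread. It is written in Python purely for illustration (the discussion above is about .NET, where a one-time timer would play the same role). The names `Watchdog`, `send_email_stub`, and `run_guarded` are hypothetical, and the timeout action only sets a flag instead of exiting the process.

```python
import threading
import time


class Watchdog:
    """One-shot watchdog: calls on_timeout unless the guarded block ends first."""

    def __init__(self, timeout_seconds, on_timeout):
        # threading.Timer runs on_timeout on a separate thread after the delay.
        self._timer = threading.Timer(timeout_seconds, on_timeout)
        self._timer.daemon = True  # do not keep the process alive for the timer

    def __enter__(self):
        self._timer.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Reached when the guarded block finishes (or raises): stand down.
        self._timer.cancel()
        return False  # never swallow exceptions from the guarded block


def send_email_stub(duration_seconds):
    """Hypothetical stand-in for the real send call; it only sleeps."""
    time.sleep(duration_seconds)


def run_guarded(duration_seconds, timeout_seconds):
    """Run the stub under a watchdog and report whether the watchdog fired."""
    fired = threading.Event()
    # A real application might exit the process here; this sketch sets a flag.
    with Watchdog(timeout_seconds, fired.set):
        send_email_stub(duration_seconds)
    return fired.is_set()
```

Because the timer lives in the same process as the code it guards, this only helps if the timeout action can actually stop the stuck work (for example by exiting the process); stronger isolation moves the watchdog outside the thing it supervises.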

You should also consider race conditions: the watchdog's timer and the email-sending process are racing. The watchdog could end up killing the process just after an email has been successfully sent. When the application restarts, the same email could then be sent a second time, which makes for a poor customer experience.
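One way to narrow that race, sketched here in Python as an illustration only, is to let the sender and the watchdog compete for a single state transition so that exactly one of them "wins". All names below (`Outcome`, `on_send_finished`, `on_watchdog_timeout`, `sent_ids`) are hypothetical, and `sent_ids` is an in-memory set standing in for whatever durable record the real application would keep.

```python
import threading


class Outcome:
    """Records which side won the race: the sender or the watchdog."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = "pending"

    def try_claim(self, new_state):
        """Atomically move from 'pending' to new_state; True if this caller won."""
        with self._lock:
            if self._state != "pending":
                return False
            self._state = new_state
            return True

    @property
    def state(self):
        with self._lock:
            return self._state


def on_send_finished(outcome, sent_ids, message_id):
    """Called by the sender right after the (hypothetical) send call returns."""
    # Record the message as sent regardless of who won, so a restart can skip it.
    sent_ids.add(message_id)
    return outcome.try_claim("completed")


def on_watchdog_timeout(outcome):
    """Called by the timer; returns True only if a kill is still warranted."""
    return outcome.try_claim("timed_out")
```

This does not remove the race entirely: there is still a window between the send returning and the record being written. It does ensure the watchdog never kills a send that has already been marked complete.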

As the commenters say, I'd strongly advise you to debug the issue. You'll need to work with production-debugging tools and techniques such as tracing, logging, and production-time debuggers (like WinDbg), which allow you to diagnose and debug a non-reproducible issue.
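In the same spirit, the watchdog itself can double as a diagnostic hook: before taking any drastic action, it can log what every thread was doing at the moment of the timeout. The following is a minimal Python sketch of that idea, not a substitute for a real debugger; the function name `format_all_thread_stacks` is hypothetical, and a .NET application would need a different mechanism (for example, capturing a process dump).

```python
import sys
import threading
import traceback


def format_all_thread_stacks():
    """Return a text snapshot of every live thread's current call stack."""
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps thread id -> that thread's topmost frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append(f"Thread {names.get(thread_id, '?')} (id={thread_id}):")
        lines.extend(line.rstrip() for line in traceback.format_stack(frame))
    return "\n".join(lines)
```

Writing this snapshot to the log when the timer fires gives you a record of where the send was stuck, which is exactly the information that is lost if the process is simply killed.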

Licensed under: CC-BY-SA with attribution