Question

I want to learn how to create truly robust applications in .net - ones that are fault tolerant and are capable of withstanding unexpected situations. Where can I find literature/guidance on this subject? So far, I am not having much luck.

Solution

I'm aware of a couple of resources. First, there's a very useful MSDN article titled "Keep Your Code Running with the Reliability Features of the .NET Framework".

Chris Brumme also wrote a post on hosting at the time the reliability features were being designed. It provides some useful background information.

Search terms that you may find useful include "High Availability", "CER", and "Constrained Execution Regions".

Good luck! Truly available code is pretty tricky stuff. :)

OTHER TIPS

If you are approaching this from a software implementation perspective, it may be worth looking into Design by Contract (DbC).

According to this source, the benefits of Design by Contract include the following:

  • A better understanding of the object-oriented method and, more generally, of software construction.
  • A systematic approach to building bug-free object-oriented systems.
  • An effective framework for debugging, testing and, more generally, quality assurance.
  • A method for documenting software components.
  • Better understanding and control of the inheritance mechanism.
  • A technique for dealing with abnormal cases, leading to a safe and effective language construct for exception handling.
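To make the idea concrete, here is a minimal sketch of contract-style checks written by hand in C#. The Account class and its members are hypothetical and exist only for illustration; the sketch uses plain exceptions rather than a contracts library, so it shows the spirit of preconditions, postconditions, and invariants, not a full DbC toolchain.

```csharp
using System;

// Hypothetical example type, not taken from any library.
public sealed class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance)
    {
        // Precondition: the caller must supply a non-negative balance.
        if (openingBalance < 0)
            throw new ArgumentOutOfRangeException(nameof(openingBalance));

        Balance = openingBalance;
        CheckInvariant();
    }

    public void Withdraw(decimal amount)
    {
        // Preconditions: obligations of the caller.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");

        decimal before = Balance;
        Balance -= amount;

        // Postcondition: obligation of this method towards the caller.
        if (Balance != before - amount)
            throw new InvalidOperationException("Postcondition violated.");

        CheckInvariant();
    }

    // Class invariant: must hold after every public operation.
    private void CheckInvariant()
    {
        if (Balance < 0)
            throw new InvalidOperationException("Invariant violated: negative balance.");
    }
}
```

A caller that breaks a precondition fails immediately and visibly at the call site, instead of corrupting state that is only noticed much later.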

In addition, I would advocate looking into a Test Driven Development (TDD) approach, which should help drive out a more robust design.

Personally, I found Stephen Toub's article to be the best source on constrained execution regions (CERs): Using the Reliability Features of the .NET Framework. In the end, CERs are the bread and butter of any fault-tolerant code, so this article contains pretty much everything you need to know, explained clearly and concisely.
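For orientation, the basic shape of a CER looks roughly like the sketch below. It assumes the .NET Framework; on .NET Core and .NET 5+ these APIs are marked obsolete and CERs are not supported. The CerSketch class and its members are hypothetical names used only for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;

// Hypothetical example type; only the attribute and RuntimeHelpers call are framework APIs.
public static class CerSketch
{
    private static int _committed;

    // The contract states that this method will not corrupt state
    // and is expected to succeed when called inside a CER.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    private static void Commit(int value)
    {
        _committed = value;
    }

    public static int Update(int value)
    {
        // Asks the runtime to prepare the finally block below ahead of time,
        // so that it forms a constrained execution region.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            // Ordinary, non-critical work goes here.
        }
        finally
        {
            // Critical section: keep it small and call only methods
            // that carry a suitable reliability contract.
            Commit(value);
        }
        return _committed;
    }
}
```

The code inside the region is subject to restrictions (for example, no explicit allocations), which the articles mentioned in this thread describe in detail.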

That being said, you could instead favor a more radical design in which you immediately resort to destroying the application domain (or rely on this pattern when the CLR is hosted). You could look at the bulkhead pattern, for example (and maybe the Reactive Manifesto if you're interested in this pattern and are facing complex data flows).
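As a rough illustration of the bulkhead idea, the sketch below caps how many calls may run against one dependency at a time and rejects the rest immediately, so a failing dependency cannot exhaust the whole process. The Bulkhead class is a hypothetical helper, not a framework type, and rejecting (rather than queuing) is just one possible policy.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the .NET class library.
public sealed class Bulkhead
{
    private readonly SemaphoreSlim _slots;

    public Bulkhead(int maxConcurrent)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
    }

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action)
    {
        // Do not wait for a free slot: fail fast when the compartment is full.
        if (!await _slots.WaitAsync(TimeSpan.Zero).ConfigureAwait(false))
            throw new InvalidOperationException("Bulkhead is full.");

        try
        {
            return await action().ConfigureAwait(false);
        }
        finally
        {
            // Always free the slot, even if the action throws.
            _slots.Release();
        }
    }
}
```

Each dependency would typically get its own Bulkhead instance, so that saturation in one compartment leaves the others untouched.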

However, the "let it fail" approach can backfire if you cannot completely recover afterwards, as demonstrated by Ariane 5.

Licensed under: CC-BY-SA with attribution