Should a base class take responsibility for wrapping incorrectly thrown exceptions that are part of the API documentation? [closed]

https://softwareengineering.stackexchange.com/questions/396896

01-03-2021
|

Вопрос

It is a common approach to use specialized exception types to indicate exceptional results of a method. These exception types are not considered to indicate a programming flaw, but, as I pointed out, are intendend to return exceptional results.

Let's assume the following base class:

public class MyCollection
{
  private readonly Dictionary<string, string> underlyingCollection = new Dictionary<string, string>();

  public void AddEntry(string key, string value)
  {
    this.underlyingCollection.Add(key, value);
    this.OnEntryAdded(key, value);
  }

  /// <summary>
  /// ...
  /// </summary>
  /// <param name="key">...</param>
  /// <returns>...</returns>
  /// <exception cref="KeyNotFoundException">An entry with the provided key could not be found.</exception>
  public string GetValue(string key)
  {
    string value = this.underlyingCollection[key];
    return this.ProcessEntry(key, value);
  }

  protected virtual void OnEntryAdded(string key, string value)
  {
  }

  protected virtual string ProcessEntry(string key, string value) => value;
}

The consumer of the class expects a KeyNotFoundException when GetValue is called with an unknown key, and can react to it. (In Java a checked exception could be used - just ignore the point whether this would be a good idea or not.)

Now someone creates a derived class:

public class CustomizedCollection : MyCollection
{
  private readonly Dictionary<string, int> countByXKeyLookup = new Dictionary<string, int>();

  protected override void OnEntryAdded(string key, string value)
  {
    base.OnEntryAdded(key, value);

    if (key.StartsWith("x", StringComparison.OrdinalIgnoreCase))
    {
      string xKey = ExtractKeyPrefix(key);
      int count;

      if (!this.countByXKeyLookup.TryGetValue(xKey, out count))
      {
        this.countByXKeyLookup.Add(xKey, count);
      }
      else
      {
        this.countByXKeyLookup[xKey] = count + 1;
      }
    }
  }

  // correct implementation
  protected override string ProcessEntry(string key, string value)
  {
    if (!key.StartsWith("x", StringComparison.OrdinalIgnoreCase))
    {
      return value;
    }

    int count = this.countByXKeyLookup[ExtractKeyPrefix(key)];
    return value + " ** X-Key count is " + count;
  }

  private static string ExtractKeyPrefix(string key) => key.Substring(0, 1);
}

The consumer of the class would write this code, and everything works as expected:

var testee = new CustomizedCollection();
testee.AddEntry("A", "Alfa");
testee.AddEntry("XB", "X-Bravo");
testee.AddEntry("XC", "X-Charlie");
testee.AddEntry("YD", "Y-Delta");
string key = null;

try
{
  key = "A"; Console.WriteLine(testee.GetValue(key));
  key = "XB"; Console.WriteLine(testee.GetValue(key));
  key = "XC"; Console.WriteLine(testee.GetValue(key));
  key = "YD"; Console.WriteLine(testee.GetValue(key));
  key = "NA"; Console.WriteLine(testee.GetValue(key));
}
catch (KeyNotFoundException)
{
  Console.WriteLine("Key '" + key + "' not available.");
}

The output is:

Alfa
X-Bravo ** X-Key count is 1
X-Charlie ** X-Key count is 1
Y-Delta
Key 'NA' not available.

Later on, a bug is introduced in method ProcessEntry of the derived class:

// buggy implementation
protected override string ProcessEntry(string key, string value)
{
  // bug here!
  if (!key.StartsWith("x", StringComparison.OrdinalIgnoreCase) && !key.StartsWith("y", StringComparison.OrdinalIgnoreCase))
  {
    return value;
  }

  int count = this.countByXKeyLookup[ExtractKeyPrefix(key)];
  return value + " ** X-Key count is " + count;
}

Now running the code results in a misinterpretation of the exception:

Alfa
X-Bravo ** X-Key count is 1
X-Charlie ** X-Key count is 1
Key 'YD' not available.

But key 'YD' is available!

The root cause of this bug lies in the fact that the consumer of this API relies on the documented behavior that a KeyNotFoundException is returned if the requested key is not found.

Who caused the bug? Well, the developer of the derived class CustomizedCollection introduced a bug that raises a KeyNotFoundException in a completely different context (namely by accessing the private countByXKeyLookup dictionary). This in turn confuses the developer who just consumes the API, because he would not understand why key 'YD' is reported as not existing though it is.

My question is: should the base class take responsibility for filtering out "wrong" exceptions that are part of the API documentation and wrap them into another non-specific exception, e.g. an InvalidOperationException? Or should developers of derived classes be aware of not letting bubble up exceptions with a special meaning that would result in a misinterpretation?

What I do not want to discuss:

whether inheritance is a good idea or not.
whether using a TryDoReturnBool approach is better than throwing exceptions or not.
whether the class should be made sealed or not.
whether aggregation is better than inheritance, and whether aggregation would solve the issue or not.
whether the root cause of an exception can be parsed from the stack trace, and whether this is a good idea or not.

What I want to discuss:

whether it is a good idea to make the base class robust enough to mitigate such programming flaws introduced by developers who derive from the base class.
whether it is a good idea to keep the base class as simple as possible and delegate the responsibility of only throwing allowed exceptions to developers who derive from the base class.

Here is my solution of a more robust but even more complicated implementation (only showing the edited method GetValue):

public class MyCollection
{
  // ...
  // some members have been omitted for brevity
  // ...

  /// <summary>
  ///   ...
  /// </summary>
  /// <param name="key">...</param>
  /// <returns>...</returns>
  /// <exception cref="KeyNotFoundException">An entry with the provided key could not be found.</exception>
  public string GetValue(string key)
  {
    string value;
    ExceptionDispatchInfo approvedEdi;

    try
    {
      this.GetValueImpl(key, out value, out approvedEdi);
    }
    catch (KeyNotFoundException exc)
    {
      throw new InvalidOperationException("Exception caught from customized code.", exc);
    }

    approvedEdi?.Throw();
    return value;
  }

  private void GetValueImpl(string key, out string value, out ExceptionDispatchInfo approvedEdi)
  {
    value = null;
    approvedEdi = null;

    try
    {
      value = this.underlyingCollection[key];
    }
    catch (KeyNotFoundException exc)
    {
      approvedEdi = ExceptionDispatchInfo.Capture(exc);
      return;
    }

    value = this.ProcessEntry(key, value);
  }
}

Original post

Решение

Question:

Should a base class take responsibility for wrapping incorrectly thrown exceptions that are part of the API documentation?

Slightly longer version of the same question:

Should a base class that is open-for-extension take responsibility to mitigate incorrect or misleading exception-throwing behaviors from derived classes and their methods?

Overview

This falls within the larger discussion of "exception guarantees", which are guarantees about some minimal safety and functionality that can still be achieved by a combination of code (library code and application code, or base/framework classes and derived classes/sub-classes) when exceptional behaviors are taken into account. The exceptional behaviors being discussed will cover both intended (correctly implemented and conveyed) and unintended (incorrectly implemented) ones.

Traditional sources of information and interests

... from unmanaged programming languages

In ahead-of-time compiled programming languages such as C++, resource management has to be baked into the source code and the compiled machine code, as opposed to managed programming languages where resource management is solely in the hands of the runtime environment. For this reason, the execution of resource management code in the presence of thrown exceptions (which normally causes the remainder of code to be skipped) is of great interest in unmanaged programming languages like C++.

Professionals who have looked into this matter typically conclude that the interactions are very complicated. Even industry experts have made mistakes when claiming that certain pieces of C++ code were written correctly in the presence of thrown exceptions which were later proven otherwise.

... and from managed programming languages

In managed programming languages, memory leak and data corruption are typically not the biggest concern because the bulk of memory management responsibility lies with the language runtime. The language runtime offers several guarantees even in the face of thrown exceptions.

However, when it comes to other kinds of resources and software operations, such as network resources and database operations, the responsibility once again falls onto the programmers.

... and from traditional engineering.

Exceptional behavior in software is part of the bigger discussion on fail-safe designs in software architecture.

Exception safety guarantees

Main article on Wikipedia: https://en.wikipedia.org/wiki/Exception_safety

Recoverability from failures

Much on the exception safety guarantees focus on exactly what can be expected as a program tries to recover from a failure (where an exception has been thrown, which causes certain code to be skipped).

In some types of application architectures (including some archaic ones), designing for guaranteed exception recoverability is somewhat easier, because in these architectures the software operation is organized into user commands. If a failure occurs somewhere inside a user command, these application architectures default to a rollback of the program state, as if the user command that failed had not occurred.

The exception handling mechanism in today's programming languages have largely benefited these archaic application architectures. There will be a "giant" try-catch block surrounding the user command, such that any exceptions thrown and not caught elsewhere will cause the user command to fail, but there would not be other data losses or errant application behavior.

In many other software applications today, today's exception handling mechanism is insufficient. A typical user command may involve chaining tens to hundreds of back-end operations; any of these could fail; the failure of these have to be handled specifically (differently) rather than generically (uniformly). This inspires the Railway Oriented Programming, which tries to keep the cost of exceptional cases manageable by coalescing.

Importance of testing systems as a whole

When it comes to frameworks, the framework development team is responsible for testing their own code as well as sample extensions that are used to illustrate the intended use of the framework.

Developers of extensions or derived classes are responsible for testing their own code.

Importance of seeing the source code inside frameworks

Sometimes, unexpected interactions between framework and extension code become impossible to troubleshoot because the source code for the framework has been unavailable or obscured (see: Software obfuscation article on Wikipedia). Framework vendors have been keenly aware of this problem, which historically has led to customer frustration and impeded future sales. Framework vendors are increasingly moving toward licensing their source code to their paying customers (while preventing them from releasing said source code to the public), so that paying customers are able to reason about the framework's expectations on exceptions being thrown from extension code.

Observations from code bases in the wild

For small teams and homegrown projects, the base classes and derived classes are typically written by the same programmer. As a result, the programmer has a consistent view and expectations on what use of exception-throwing behavior would be acceptable. While this is nice and efficient use of time, the drawback is that the sole programmer often do not spend time writing down these expectations on the use of exceptions in words. Without trying to write that down, the programmer might not spend much time thinking through it carefully either.

For medium-sized teams, it is often crucial that programmers who are writing derived classes (extensions) can consult with the original programmers who wrote the original base classes. It is expected that all programmers involved have excellent code-reading skills, and can reason about exceptional behavior through complicated code written by different programmers. If necessary, they might need to validate their reasoning through experiments, by injecting exceptions into specific places. Software of this scale often have extensive logging, which will assist in understanding exceptional software behavior, both in development and in production.

For large teams, commercially-sold software frameworks, and operating systems. The risk of less-than-diligent extension code bringing down the system is so huge, such that these large scale software systems will often go the extra mile of intercepting and isolating incorrect behaviors from extension code. In these software systems, during each transition-of-control from framework code to extension code, there will be try-catch blocks around, as well as attempts to protect the original software states so that these states are recoverable in case of corruption. There will also be user interface elements that inform the human user of an error condition, to make it clear that the incorrect behavior likely came from a specific extension code, not from the framework itself.

~~Response to the original question~~

The answers by Doc Brown and ShapeOfMatter both suggest that there is a design issue in addition to the choice of coding preference.

However, I see a different design issue:

I suppose when a typical programmer sees a method named DecorateFilename, one would have some intuition about what it is supposed to do, and what assumptions can be made about it.

For example, its failure might mean that the filename somehow doesn't conform to the string formatting rules that it expected, such that it cannot be decorated.

Now that DecorateFilename is being repurposed to do file name validation (against a list of allowed filenames), it introduces two additional failure modes, which the base class (nor the original programmer who chose the method name for DecorateFilename) didn't expect:

The file where the list of allowed filenames came from might not exist.

The file name being passed into LoadData might have failed this validation check.

I agree that, ideally, these additional failure modes should map to different exception types, just to make the API caller's job of handling them easier. We all know how annoying are misleading UI error messages.

One could shift away failure modes (to other points in time) by shifting away responsibilities. Several possibilities are:

Load the "file where the list of allowed filenames are stored" ahead of time. As several answers and comments pointed out, this file is of an "infrastructure" or "deployment" or "configuration" nature, and these things are typically initialized upfront.

Separate the "filename decoration" and "file loading" responsibilities, so that a new step of "filename validation" can be inserted in between them, and the API caller will know to expect to catch different exceptions from them.

However, nowadays, a lot of programming practices have actually increased this type of inadvertent misleading exceptions being thrown (seen) at unexpected times. Going back to the option of loading the "allowed filenames file" as part of configuration. There is common practice of delay-loading everything until it is actually used. If that were the case, the exception will be thrown when it is actually used, which means our attempts to shift away failure modes will not work.

To summarize, we must accept that, today's programming styles necessarily open us to the possibility that unanticipated failure modes and exceptions will be thrown at us, at unanticipated times (from API methods that we don't think would act in such ways), and that we must anticipate, embrace, and accommodate these possibilities.

Другие советы

You would prevent those misleading exceptions from bubbling up by catching them and converting them to a non-misleading exception.

  public class CustomizedFileManager : MyFileManager
  {
    protected override string DecorateFilename(string filename)
    {
      try
      {
        string decoratedFilename = base.DecorateFilename(filename);
        var allowedNames = File.ReadAllLines("ListOfAllowedFilenames.txt"); // can throw FileNotFoundException

        if (!allowedNames.Contains(decoratedFilename))
        {
          throw new InvalidOperationException("Filename not allowed");
        }

        return decoratedFilename;
      }
      catch (FileNotFoundException e)
      {
        throw new DeploymentException("The file \"ListOfAllowedFilenames.txt\" couldn't be found", e);
      }
    }
  }

The CustomizedFileManager class would know that the FileNotFound exception refers to a deployment problem and not to a problem related to the file specified by the user, so it can do the proper re-interpretation of the exception.

You catch the exception and re-interpret it.

  public class CustomizedFileManager : MyFileManager
  {
    protected override string DecorateFilename(string filename)
    {
      string decoratedFilename = base.DecorateFilename(filename);

      try
      {
          var allowedNames = File.ReadAllLines("ListOfAllowedFilenames.txt"); 
      }
      catch (FileNotFoundException e)
      {
          throw new InvalidOperationException(
              "The Load Data operation could not be executed because " + filename + 
              " could not be found, most likely due to a deployment error.", 
              e);
      }

      if (!allowedNames.Contains(decoratedFilename))
      {
        throw new InvalidOperationException("Filename not allowed");
      }

      return decoratedFilename;
    }
  }

I'm not sure why you think this would be a code smell. If your requirement is that "File Not Found" exceptions not bubble up the stack, then you write code that accomplishes that.

Your Edit 1 just makes it more clear that it is the responsibility of the derived class to determine the appropriate exception to return.

The exception you return should be actionable in some way whether it typically gets ignored, halts the program or triggers something else.

Is there an additional action that a consumer of your collection might want to take when a user requests display of a key that isn't in the 'approved' list vs just requesting the display of a key that isn't in the dictionary?

You would never want consumers to have to parse out an exception message to determine what action to take but it's possible that by throwing a KeyNotInApprovedListException you could allow consumers to trigger a security audit log entry, color a GUI element red or forward the user to somewhere they can request approval. This would be a decision made by the derived class designer and the exception type thrown may be in a library containing the derived class that has use-cases never even considering during design of the base class.

The book Framework Design Guidelines has a chapter on Exceptions which has a specific subsection 7.2.3 on Wrapping Exceptions relevant to your question which points out how exceptions from a lower layer like FileNotFoundException may be completely meaningless if allowed to propagate from a transaction management API to a higher level and that may be a case where you would throw a TransactionFileMissingException

There's numerous ways you could handle this. Three possibilities are:

Focus LoadData on just loading data

Just have the method be:

public string[] LoadData(string filename)
{
    if (filename == null) throw new ArgumentNullException(nameof(filename));

    return File.ReadAllLines(filename);
}

Now that exception will be simply because the file can't be accessed. It becomes the responsibility of another piece of code to determine filename.

Block DecorateFilename from throwing a FileNotFoundException

public string[] LoadData(string filename)
{
    if (filename == null) throw new ArgumentNullException(nameof(filename));

    string decoratedFilename = null;
    try
    {
        decoratedFilename = DecorateFilename(filename);
    }
    catch (FileNotFoundException ex)
    {
        throw new InvalidOperationException(InnerException: ex);
    }

    return File.ReadAllLines(decoratedFilename);
}

So now, an InvalidOperationException is thrown if DecorateFilename has a problem.

Use the "try pattern" to avoid exceptions

public string[] LoadData(string filename)
{
    if (filename == null) throw new ArgumentNullException(nameof(filename));

    if (!TryDecorateFilename(filename, out var decoratedFilename))
    {
        throw new InvalidOperationException();
    }

    return File.ReadAllLines(decoratedFilename);
}

So TryDecorateFilename doesn't throw an exception; it returns false if it fails.

I don't think we could make any reasonable approach toward a general solution to the problem you're describing. This is the best I can do:

The potential for a "misleading exception" is suggestive of an overly complicated or malformed solution.

Parsing exceptions in-code should be a last resort.

In the example you give, we should be concerned both by the fact that our file-reading code can throw a FileNotFound exception because of a missing config file, and that we're relying on a FileNotFound exception to detect if a file exists.

It may be possible to put the list of allowed filenames in-code, which would be fine, or in more general-purpose config file who's presence and validity would already have been checked. If that's not appropriate, then the allowed-filename config file should be read in the CustomizedFileManager constructor (which might take the name of that config file as an argument).
If code higher in the stack wants to know if a file exists, it should ask if it exists, not wait for an exception. We violate this often, because sometimes it's not worth it to do everything right, but that doesn't change what the right way is.

Designing classes correctly for allowing inheritance is often harder than it looks at a first glance. In this case, one issue is already hidden in the comment

/// <exception cref="FileNotFoundException">File <paramref name="filename" /> has not been found.
/// </exception>

which makes this an informal contract of the public API of the class. A user of the class will expect this exception representing exactly this kind of failure, nothing else.

Together with a protected function which allows derivations to throw arbitrary exceptions, makes it easy for derived classes to violate the Liskov Substition Principle - which means, when a derivation tries to use exceptions to communicate errors to the caller, they easily risk not obeying the original contract of the class.

I see different ways of dealing with this:

Simply don't allow inheritance, make the original class sealed and change the protected method to "private". That way, the user of the class is forced to use a solution similar to the one sketched by DavidArno as #1 - add the additional file name check at a different part in code, with its own error handling. This solves the problem by giving a piece of code less responsibilities. Putting the responsibilities of file loading, automatic file renaming, and automatic validating against a list of allowed files all in one function makes it harder to separate the results of the different steps, especially in case of an error.
Change the comment - make it clear the caller can get all different kinds of exceptions. So if they get a FileNotFound exception, they should always evaluate or utilize the message text (which in this contains actually a statement which file precisely was missing, I tried it out). They should also deal with InvalidOperationExceptions or other types. This opens the possibility for derived classes to use different exceptions for different problems without violating the LSP (like shown in Bart van Ingen Schenau's answer and Robert Harvey's answer).
Enforce derived classes to obey the LSP (this means, make sure they do not throw any exception of a different type or different semantics than what is described in the comment). Unfortunately, since the contract was only defined as a comment, this can only be enforced by code reviews, and in this case it makes it almost impossible for a derivation to make use of proper error handling.

So the root cause of the problem lies IMHO already in the design of the base class, and trying to solve it afterwards in the derived classes, without changing the base class, can only be a trade-off.

Лицензировано под: CC-BY-SA с атрибуция

Не связан с softwareengineering.stackexchange