Throwing exception from constructor?

https://softwareengineering.stackexchange.com/questions/411117

12-03-2021
|

Domanda

I have this NamePath class that accepts a string from some system path and tries to split it into two properties, Name and LastName.

Is the code example below fine? I've read that throwing exceptions in constructor is okay, but is there a better way for this case?

public class NamePath{
   public string Name {get; private set; }
   public string LastName { get; private set; }

   public NamePath(string path){   
       Parse(path);
   }
   private void Parse(string path){
       if ( ParsingOkay() ) {
          // set the properties
       } else {
          throw new Exception
       }
   }

}

I've thought of another approach where NamePath is just data class, and I would have a static method on another class that would try to create NamePath object or else it would return null.

public class NamePath {
   public string Name {get; private set; }
   public string LastName { get; private set; }

   public NamePath (string name, string lastName){
       Name = name;
       LastName = lastName;
   }
}

public class PathHelper {
    public static NamePath ParseNamePath (string path){
        if ( ParsingOkay() ){
           return new NamePath
        } else {
           return null
        }
    }
}

code example is semi-pseudo

Soluzione

Throwing exceptions in constructors in C# is fine, but a constructor should always create a valid object. I prefer to keep construction devoid of parsing. Parsing is inherently tricky, because you cannot trust the source of the data. Exceptions should be expected when parsing anything. The .NET stack already has a pattern to deal with this: Parse and TryParse.

While a monad is more idiomatic across languages, the [Try]Parse pattern is a long standing idiom for C#.

The constructor always takes data in the necessary pieces, and throws ArgumentXExceptions in the case invalid arguments are passed. The parse methods can be static methods to handle the parsing either by creating a valid object or throwing a FormatException.

var path = new NamePath(name, lastName); // throws Argument*Exception

try
{
    var path = NamePath.Parse("...");

    // Use properly constructed path
}
catch (FormatException)
{
    // Parse errors
}

if (NamePath.TryParse("...", out NamePath path))
{
    // User properly constructed NamePath as path variable
}
else
{
    // Parse errors
}

This gives code authors some options with how to construct, parse and recover from errors. You'll see this pattern reflected in the native .NET types DateTime, int, decimal, etc. Your custom NamePath type will feel more natural to deal with along side the built-in types:

if (int.TryParse("...", out int age))
{
    // valid int
}
else
{
    // invalid int
}

if (NamePath.TryParse("...", out NamePath path))
{
    // valid NamePath
}
else
{
    // invalid NamePath
}

Altri suggerimenti

Let us assume the caller doesn't need more information than if the parsing failed or not (for whatever reason). Then functionally both of your examples are mostly equivalent. The question you need to ask here then is: how would you prefer to use the feature for constructing NamePath? Would you prefer to write

   try
   {
      var namePath = new NamePath(path);
      // some code using namePath
      // ...
   }
   catch(SomeException ex)
   {
       // do some error handling here
   }

Or would you prefer to write

      var namePath = PathHelper.ParseNamePath(path);
      if(namePath==null)
      {
           // some error handling here
      }
      // some code using namePath

If the error handling in the second case is always to throw an exception, then it should be clear that the first variant would make more sense and become more consise.

Note in both cases, if one forgets the error handling, in the first case the outer scope may get a more specific exception, in the second case the outer scope may get a "null reference exception".

Nevertheless, both variants can be used to write correctly working code, and they don't have a huge difference in maintainability and readability. Hence, it stays to be a judgement call.

Words like "OK" or "better" are meaningless unless you can define what those terms mean for your specific software application. The way you do that is you ask: "why?"

OK. So why would you throw from a constructor?

In these situations, it can be useful to determine what your possible options are, and then evaluate each one.

You can throw, or
You can allow the constructor to return an incomplete object.

That's about it, if you want to use a constructor. If you can use something other than a constructor, like a factory method, you can return a more sophisticated object like a Maybe Monad. Maybe that's your "better" way.

But let's say you're not there yet. You just want to know whether or not to throw in a constructor.

The way you choose between those two options is to determine if there's more than one way for the constructor call to fail. Because if there's more than one way for it to fail, you're not going to know which way it failed if you don't throw. because that approach doesn't transmit any specific information about the failure, and you're left with an invalid object without knowing why it is invalid.

And this is why it's desirable to throw in a constructor. When you throw, you can communicate information about why the constructor call failed, and you prevent incomplete objects from being created.

Using a factory method to return null on construction failure does prevent incomplete objects from being created, but communicates no information about why the failure occurred.

I should probably mention that, if your constructor throws exceptions, you're going to have significant performance problems if you do something like parse a file in a tight loop using objects created via constructor calls. Exceptions are expensive, and your parsing operation will grind to a halt if the file you are parsing is not well-formed.

All exceptions are not created equal. Exceptions need to be exceptional and your code needs to be concise, readable, and you need to be able to reason about your code.

Of course, you can throw anything you like in your constructors. Or you can avoid throwing. The question is... do you have a plan? Writing code is, in a way, like writing a story. You have infinite options, dozens of which are almost identical or equivalent. Why am I saying that? Because when you think of your code as a part of reality, you can exploit your mental capacity of grasping concepts and ideas from real life and translate them to code. And then you can be more productive because you think in patterns.

Allow me to suggest a plan: Consumers of your code should be able to skip the instructions to the farthest extent possible.

The only information you can safely convey to consumers of your constructors, and be sure that they will understand it, know about it, and even expect it, is that it puts the pieces together and ensures structural integrity, i.e. if the constructor is successful, the instance of the object is ready for use!

Be extra careful that you understand exactly what "structural" means. Structural integrity is what you have when things are in place and they are what they say they are. Having a null value instead of a valid instance spoils structural integrity. Having an empty string does not inherently do so because it's still a valid string. You can obviously choose to rule empty or whitespace strings out (as the BCL team decided when they made the string.IsNullOrEmpty and string.IsNullOrWhiteSpace methods, because empty and whitespace strings are often as good as null values) but that's that. All other string instances still don't inherently spoil structural integrity. Additional rules about what you want to count as a valid string is information you just cannot cleanly convey through a constructor.

Apart from this information, there is nothing, really, that you can convey to the consumers of your constructors without actually getting them to read the instructions, your instructions. So, in order to spare them the instructions, you can simply avoid doing anything more specialized in a constructor. So, to keep constructors trivial to execute, you have to:

Only throw ArgumentNullExceptions and ArgumentExceptions by checking for null or empty strings.
Only make assignments to members.
Make sure any clever tricks you do in there don't really matter much.

This is really just the principle of least astonishment at work. When I retrieve a property of an object, I don't expect it to take more than a trivial amount of time. Same goes for a constructor, otherwise, the consumer will be caught by surprise.

Now, read Eric Lippert's terrific article on vexing exceptions. Then, you will know that, whenever our code does actually stubmle upon an ArgumentNullException in a constructor, it's not the user screwing up, it's us. These "guard" checks are there to ensure that when we screw up in passing our arguments, we find it out as soon as possible, i.e. upon construction. In production, these guard checks are never expected to throw at all, except if something goes extraordinarily wrong.

One thing that experience teaches is that this is a pattern consumers can very easily get their heads over. Consumers almost always know in advance and expect that they cannot trick constructors into (or simply fall unknowingly into the trap of) creating problematic instances by passing null values. Beyond that, all dependencies passed into a constructor will be valid, simply by virtue of having been constructed themselves.

Sometimes, you need to do special things to create an instance, because that's life. Again, the consumer cannot know that you need to parse something to create an instance. Spare them the instructions of how to use the constructor. Make the constructor private, and create a static factory method that prepares the arguments and constructs the object or fails. Name the method Parse or TryParse if you need to provide a fail-safe. Now consumers know what is going on... the construction of this object takes some processing, and, as a result, it is not entirely trivial (and how could it be, since you need to create the object from a string, a notoriously flexible type). This way, consumers will get the message that you may throw specialized exceptions, or just fail for whatever reason, and they will take precautions.

Also, mind Robert Harvey's excellent advice about the performance cost of exceptions in constructors. Make sure you provide two options, one of which just doesn't care, skips exceptions and fails non-exceptionally. Otherwise, make sure you have some good use for the painfully collected stack traces. All the comments about how exceptions are useful are fine, but do you really care about the stack trace or you simply want to show a fancy message to your users about what they are probably doing wrong? If so, just create an enumeration of a couple of reasons you know could go wrong and output them together in case of failure. Exceptions hurt.

Your second code example seems mostly OK. However, I suggest throwing away the PathHelper, it's just expanding the codebase for no good reason (unless you want to make extension methods, which would not be that bad an idea, but then you have to keep the constructor public, to make it accessible outside the class and this defeats the purpose of protecting consumers from invalid instances). Place the method inside the NamePath class, close to where it matters.

public class NamePath
{
   public string Name {get; private set; }
   public string LastName { get; private set; }

   //Now I can have the constructor private, because I don't want
   //anyone to ever write (e.g.):
   //new NamePath("Humphrey", "Bogart");
   //or they will get some bad behavior in other places of the code
   //where it is assumed that NamePath instances are reasonable and
   //have been created properly.
   private NamePath (string name, string lastName)
   {
       Name = name;
       LastName = lastName;
   }

   public static bool TryParsePath(string path, out NamePath output)
   {
       if (ParsingOkay())
       {
           //create instance and assign to output
           return true;
       }

       output = null;
       return false;
   }
}

Now, after you have put everything in place, think about the following two lines of code:

new NamePath(myPath);
bool parseSuccessful = NamePath.TryParse(myPath, out NamePath namePath, out ParseErrors potentialParseErrors);

Would you prefer to consume the first or the second version? How would you feel as a consumer, if you knew that the two are equivalent (only you have to catch a couple of exceptions in the first case)? How would you feel as a consumer if you just had NamePath class with a constructor, and you had to dive into the instructions to find out if anything special takes place in the constructor, which you have to watch out for? Would you be prepared to do that for every object constructor you come across within a library?

Take-home message

Try to avoid constructing objects from strings when those strings play a special role for the object and need to have a specific format. Write concise methods that make the consumer feel adequate unease to guard themselves and/or read the instructions. Make consumers feel at ease when just calling constructors... any constructors.

In short, there is no special reason to throw or not to throw specific exceptions in a constructor (other than the structural integrity ensuring ones, i.e. ArgumentNullException and ArgumentException). It is just a consistent plan, the simplest one, really, to not perform additional logic and, as a result, not throw any more specialized exceptions at all. It reduces entropy, it is more productive, easier to follow, and requires a lot less guesswork from the consumer perspective.

Autorizzato sotto: CC-BY-SA insieme a attribuzione

Non affiliato a softwareengineering.stackexchange