Fail-fast design vs. limiting constructor logic

https://softwareengineering.stackexchange.com/questions/404230

06-03-2021
|

Question

In any programming task, my preference is to write fail-fast code. That doesn't seem to be too controversial. However, I've also seen many developers say that constructors should do as little as possible. I'm finding that these two goals are often at odds, and conversations over fail-fast design vs. constructor simplicity sometimes devolve into statements of preference.

Consider a class that provides write access to a file. The class accepts a file path in its constructor which must be a path to an existing file. Some developers would say that a constructor shouldn't be accessing the file system, so I might just do a null check:

private readonly string file;

public FileAccessProvider(string file)
{
    this.file = string.IsNullOrEmpty(file)
        ? throw new ArgumentException(nameof(file)) : file;
}

Now, if the provided file doesn't exist, we might not know about it until the class actually attempts to write data to the file. In this case, would it be acceptable to do a check for file existence in the constructor?

private readonly string file;

public FileAccessProvider(string file)
{
    this.file = string.IsNullOrEmpty(file)
        ? throw new ArgumentException(nameof(file)) : file;

    if (!File.Exists(file))
    {
        throw new ArgumentException("File does not exist.");
    }
}

This complicates my constructor, but better adheres to fail-fast design. We could even take this a step further and add code to the constructor that checks if the user has write permissions to the file.

What about a class that has to access a database or network resources?

Is there an accepted side to err on in situations like this?

Solution

Let me step away from your actual File I/O example, as you have noticed by yourself, this introduces aspects to your question you did not intend to ask here.

My answer to your general question is: don't take the idea "limiting constructor logic" too literal!

If the constructor of an object cannot initialize the object properly, throwing an exception is the standard way (and sometimes the only reasonable way) of signaling this to the caller. And if the decision about the initialization being successful or not requires to run a some more or less complex validation logic inside the constructor, that is fine, at least in most cases. That is usually the kind of "logic" which is acceptable inside a constructor.

Of course, there are complex cases where a two-step initialization of an object might be preferable, where the constructor does only some basic initialization and a separate Init method is required for the "heavy lifting". But the reason for such is design should be never a braindead rule like "no work inside the constructor". A sensible reason for such a design could be, for example, if you need to have the basic object construction done in a factory, and the Init method is a virtual method which is called as part of a template method.

OTHER TIPS

I deeply believe in fail fast myself. In addition to promoting system integrity it helps debugging devs by getting them as close to the cause of the issue as possible so they don't have to slog through file after file trying to figure out where a bad value came from.

However, file I/O is not something you should pretend you control. You don't. You're never going to. Say you test that the file exists and then try to use it. Well then Murphy will wait until your code is deployed on a critical system to delete, rename, or move that file between when you tested it and when you attempted to use it.

Every time you use a file, suspect that it might not be there and be ready for it. Nothing you do before hand absolves you of this. It's simply the reality of how I/O works. Don't pretend Murphy won't do this because your app is the only one on the system. If Murphy has to he'll take the form of a rat that gnaws through your data cable.

On top of that, why throw an error if the file doesn't exist? Maybe this is a chance to make the file exist without hassling the user. Fail fast doesn't mean be pessimistic.

If the system is c:\MS\Windows and the user is typing /dev/nul you could be nice and parse their input and realize it's unusable. This is validation. It's the ONE thing constructors should do besides set instance variables. It's not about the current state of the file system. It's about whether your input is crap.

Validation can't involve talking to peripherals. In fact the thing you're validating must be full blown immutable or the test is a waste of time.

Fail fast should never overturn this. Fail fast argues that once attempting to use the file has caused an error you should get that exception going now, not later. But also, not before.

I think you have two very different cases here:

A null-check. This should throw as soon as possible, because it is a bug in your program. That string will never become non-null, there is no way to recover or change. This is just wrong and should be caught immediately. There is no point in continuing with this broken class if the file is null. Please throw and let me know immediately that something is wrong before I have to figure out why it all went to pieces way further down the call chain.
A check for other, external dependencies. Those are not good checks anyway because they are pointless. The state of the file in the constructor says nothing about the state at the time of the call of the actual method. Lets say the file exists and you have write access. That could have changed by the time you make the actual call to the file writing method. So the check is pointless and superfluous. I have to handle exceptions on the actual method anyway, so please don't make me handle it in two places.

This pattern is used in .NET all over the place. For example while all the file handling methods in C# will throw an ArgumentNullException immediately upon null input data, there are no methods to easily check if you have write access. It does not make sense to check beforehand. You will see it when you actually try to write to the file. Knowing that you had write access two minutes ago is not useful at all.

I have never heard of fail fast as a software design principle. I view fail fast as a project or organisational strategy where one wants to learn of project failures as early as possible so one can course correct or abort while limiting the cost of having gone in the wrong direction. It's a risk mitigation strategy when there are a lot of unknowns.

I will also propose a goal of doing work as late as possible, i.e. when it is needed. In your example, I would not check if the file exists or if the name is plausible until a method accessing the file is called. In real-world examples (when classes do more than one kind of work) that time may never arrive.

Maybe it's due to the nature of your example being simplified - I would not do any of these checks. The framework will throw ArgumentNullException or FileNotFoundException (or SecurityException) on it's own.

The benefit of structured exception handling is so one can focus on the happy path throughout and error messages can be passed up the stack without every layer having error checks.

Seems like the provider class should expose a method that actually does the providing and error checking.

Then your class and constructor can be nice and simple (or just take in whatever other dependencies are needed) and not have any side effects. This also lets you add other methods to the class to support different data types or functionality without needing to have constructor overloads or other refactoring.

If I was needing to instantiate this provider, I wouldn't expect it to start making logical choices until I told it to go do something, via a method.

Perhaps better would be:

public class FileAccessProvider
{
    public string Get(string file)
    {
        this.file = string.IsNullOrEmpty(file) ? throw new ArgumentException(nameof(file)) : file;

        if (!File.Exists(file))
        {
            throw new ArgumentException("File does not exist.");
        }

        //get object

        return object;
    }
}

In this case, would it be acceptable to do a check for file existence in the constructor?

No.

Frankly, it's not acceptable to do the null check in the constructor.

(Why? Because very few people know what the behavior is when you throw from a constructor. Is the object created? Can you use it safely? What about if the constructor is in the middle of a type hierarchy?)

Worse, this check (along with the permissions check) are ineffective. Just because the file exists now and you have permissions now doesn't mean that it will exist with the right permissions when you try to use it (and vice versa).

Is there an accepted side to err on in situations like this?

In C# and similar languages, if your simple object creation can fail you'd make a static function that would do the cursory checks (path is nonnull) and return a new object. That would then have enough logic to fail fast when you're sure it will actually fail.

Late to the party but one option here that allows you to keep the constructor simple and also put logic around creation of the object is to use a static factory method.

By hiding the constructor, you can provide various, clearly named ways to create a new object. Any validations or computations that need to be done are placed in the body of the factory method and the constructor.

This is especially useful in cases where you want to create immutable objects (which should be the default choice, IMO.) It tends to be easier to create immutable objects when the constructor is simple. I find I have been able to eliminate the need for 'init' methods that set values after the constructor using this approach.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange