Why is it bad to use exceptions for handling this type of validation? It seems like it makes the code so much cleaner

https://softwareengineering.stackexchange.com/questions/385808

19-02-2021
|

题

I'm working on a .NET Core REST API and I'm writing a service class to create new user accounts. I have the following code:

    public async Task<UserDto> RegisterNewUserAccount(CreateAccountDto userInfo)
    {
        EnsureUserDoesNotAlreadyExist(userInfo);
        EnsureRfidCardIsNotAlreadyClaimed(userInfo);

        var user = new UserDto()
        {
            Email = userInfo.EmailAddress.ToLower(),
            FirstAndLastName = userInfo.FirstAndLastName,
            Rank = 1200,
            JoinedTimestamp = DateTime.UtcNow.ToString()
        };

        await _usersRepository.CreateUser(user);

        var passwordHasher = new PasswordHasher<AccountCredentials>();
        var credentials = new AccountCredentials()
        {
            Email = userInfo.EmailAddress.ToLower(),
            HashedPassword = passwordHasher.HashPassword(null, userInfo.Password)
        };

        await _accountCredentialsRepository.InsertNewUserCredentials(credentials);
        await _emailSender.SendNewUserWelcomeEmail(userInfo);

        return await _usersRepository.GetUserWithEmail(user.Email);
    }

    private void EnsureUserDoesNotAlreadyExist(CreateAccountDto userInfo)
    {
        if (_usersRepository.GetUserWithEmail(userInfo.EmailAddress) != null)
        {
            throw new ResourceAlreadyExistsException("Email already in use.");
        }
    }

    private void EnsureRfidCardIsNotAlreadyClaimed(CreateAccountDto userInfo)
    {
        if (_usersRepository.GetUserWithRfid(userInfo.RfidNumber) != null)
        {
            throw new ResourceAlreadyExistsException("RFID card already in use.");
        }
    }

To me, this code is clean, readable, and obvious in what it does. However, it seems like a lot of people are strictly against using exceptions for this type of logic. I know exceptions shouldn't be used for normal control flow, but for something like this, it seems so natural. The one thing I've considered changing is renaming the two helper methods to ThrowIfUserAlreadyExists and ThrowIfRfidCardIsAlreadyClaimed, to make it more clear that these private helper methods serve no purpose other than to throw an exception if a condition is met.

Obviously, I wouldn't want to use exceptions for validating general user input, such as making sure their password meets the requirements, etc., but in cases like this the line seems blurred.

If you believe this is bad practice, how would you write code such as this instead? Why is it bad to do it the way I have?

For instance, if I tried to do so using a DoesUserWithEmailAlreadyExist() check before I called my method, what happens if that user goes away between that check and now? Furthermore, how would I even go about returning some type of return code here instead without having a giant mess? I can only have one return type on the function, of course. Are you saying I seriously have to wrap every response from the method in some kind of CreateUserResult object or something? That seems obnoxious and like it would leave the code a mess.

解决方案

The first issue I have with using exceptions for validation is that I would typically expect the outcome of the validation to potentially churn up multiple errors with the same data.

Exceptions are useful for non-Happy Path scenarios when some code fails and the best course of action is to unwind the stack as a consequence, passing the error much further up to something which can handle it gracefully and recover or otherwise leave the system in a good state.

Stack unwinding makes exceptions generally unsuitable for validation as I would expect the whole validation process to continue until all checks have been run after any error with the data is found, so that the user is able to have a complete report on the issues in their data from a single POST/PUT request.

Update: As pointed out in the comments, it is also technically possible to treat an exception as a return value by bundling all the validation issues into a single exception and throwing that at the end, however I also wouldn't recommend or advocate this either.

Validation is a user-facing functional feature of a system, and is about identifying issues with data, rather than errors in the behaviour of the system. A validation error being spotted still implies that the system is behaving correctly, however an exception should be used to indicate that the system or one of its dependencies is not behaving correctly.

C# / ASP.NET Specific stuff:

In the specific case of ASP.NET Core, a possible solution could be to use the Custom Model Validation described here: https://docs.microsoft.com/en-us/aspnet/core/mvc/models/validation?view=aspnetcore-2.2

Another possible alternative could be to use FluentValidation: https://fluentvalidation.net/aspnet

其他提示

It seems to me that in your example the two exceptions are exceptional. ie. you don't expect them to be thrown very often.

The problem with the overall pattern is usually you DO expect validation checks to fail.

I would code this up the same way you have. But I would in addition have code which performed the same validation checks before submitting the create user request.

These checks would not throw exception, they would return a list of human readable validation failures, with enough meta data to enable highlighting of problematic fields etc.

Sure I need to recheck when the actual request comes through, in case of race conditions. But that is now an exceptional and unlikely event. I will just log it and tell the user "error, please retry"

In your particular example with context you provided:
- without examples of how controller endpoint is consumed

It could be ok to throw exceptions when some "exceptional" validations fails. As you mentioned that you will not use exception for non-exceptional validations (general user input)
I think name Ensure.. should be ok for method which will throw an exception. For example HttpResponseMessage.EnsureSuccessStatusCode use same naming approach.

Without examples of how controller endpoint is consumed

Client of this API need to handle possible exceptions, and pretty sure it is already handled, but if client want to distinguish between Server Error(500) and exceptional validation error, then it will be difficult because with current implementation 500 Server Error response will be returned in all cases.
Following REST API HTTP Status codes, I think returning response with status code 422 Unprocessable Entity could be one of the possible approaches to help client separate exceptional validation with server errors.
That mean inside controller we need to use try .. catch which will not be nice and readable anymore I guess.

Suggestion from performance perspective(not a cult-cargo ;)) by Microsoft:
DA0007: Avoid using exceptions for control flow

The biggest practical problem you’ll encounter is performance difference. The cost of checking a return value versus stopping execution, unwinding the stack, and otherwise dealing with exceptions is about an order of magnitude.

“That’s not a big deal”, you say, “the call to the DB is way more expensive!”

And it is. The problem comes that if you’re at any sort of scale, your servers will be tuned to handle your normal load from normal traffic.

What happens when some asshole decides to flood your site with requests - but not just any request, the same request over and over to create some user that already exists? Now you have an order of magnitude more traffic than normal, and the requests take an order of magnitude more load from your web servers because they’re going down the exception route.

Your servers melt, the site goes down, bad times all around. Modern cloud systems can deal with most Denial of Service attacks, but that exponential effect due to common exception cases will usually be enough to overwhelm them. At best it will reduce the response time and increase the cost of dealing with it.

Exceptions are not to be used for flow control.

The use for exceptions for this kind of validation is "bad" because, for the code as given, you actually do not want this kind of validation. You write:

public async Task<UserDto> RegisterNewUserAccount(CreateAccountDto userInfo)
{
    EnsureUserDoesNotAlreadyExist(userInfo);
    ...
    await _usersRepository.CreateUser(user);

What you have here is a race condition:

You first check whether the new item fulfills some uniqueness requirements.
Then you run some (little) code and only then you try to insert the item.
=> By which time the uniqueness requirements may already be violated by by some parallel operation!

So, disregarding any performance issues, and connected to what Ben wrote above:

The first issue I have with using exceptions for validation is that I would typically expect the outcome of the validation to potentially churn up multiple errors with the same data.

The problem you have is: For reporting to the user you want to collect all pre validation, and for this exceptions certainly don't make sense.

For validation inside RegisterNewUserAccount you must validate atomically, i.e. CreateUser must either succeed and commit its changes or fail. For this exceptions are OK, and they are exceptional, because you already validated all input. These most likely would just report a unique constraint violation from the backing store.

The main argument against flow-by-exception is not one of code readability, but one of code performance.

Exception handling is expensive. And this is by design, or at least a direct consequence of its design: detailed data logging. The idea is that when an error occurs, there is a high need for fully detailed information (hence the elaborate stack trace) and less of a need for high performance. People generally only expect good performance from successful operations.
.Net does the exception data aggregation for you. But that means that whenever you throw an exception, .Net will always go through the expensive and slow motions of gathering that data, and you can't avoid that.

Don't get me wrong, there's also a secondary readability argument. Quite some time ago, we started (almost) universally agreeing that goto logic inevitably leads to spaghetti code because the flow of the application is no longer trivially readable or understandable. throw and catch are under the hood still just a goto statement.

However, goto is an acceptable exception handling approach, because we again run into different priorities when faced with errors. More often than not, you want exception to bubble up to a high level, often crossing multiple layers. And when you don't use a goto-like call, that means that every layer (and thus every method you pass) needs to account for these error return values. It's nigh impossible to expect every developer to account for every possible error at all times, and they will inevitably forget to handle some fringe cases once in a while. And even if they did remember it at all times, the handling of these error cases would lead to more bulky code, and it would take a significant amount of extra time to always implement it.

To sum it up, there are cases pro exception throwing (handling unforeseen errors), and there are cases con (handling intended business logic). And it's up to developers to ensure that exceptions are only thrown in cases where the error is truly exceptional, meaning that you didn't explicitly account for it in business logic, most commonly because there are so many possible errors and it's be a folly to try and explicitly cover for everything (as an extreme example, how often have you written code to doublecheck that the HDD is still accessible?)

许可以下： CC-BY-SA 和归因

不隶属于 softwareengineering.stackexchange