Where to validate user input - Constructor, Validation object, or Factory method?

https://softwareengineering.stackexchange.com/questions/411353

12-03-2021
|

質問

Let's say your application takes in a user request and you don't trust the front-end validation (because you never trust the front-end validation). In your controller or other handler, you want to convert this input into a domain object, since you'll need to pass it around a bit. In this scenario, what's the best practice for defensively validating the data before creating the object.

I have three proposals:

Pass the data to the object constructor and let it perform any necessary integrity checks, throwing an exception if any issues are found.

The main issue I have with this is the lack of detail returned to the caller explaining what happened. If I validate it externally in some kind of Validator class, I could construct a ValidationResult object which exposes IsValid and Errors list that can be used for a better client experience.

Use the Validator class I described above. If Validator.IsValid is/returns True, you can pass the same parameters to the constructor to create the object.

The issue with this one is that you can't guarantee that the Validator will always be used. If the constructor is called with invalid data, this could lead to crashes or subtly invalid states.

The last option is to only expose object creation through a factory method, using the Validator object (most likely passed in as a parameter).

The issues with this option are sort of language dependent. In a strongly typed language, this would appear to have the same issues as #1 since your method would be expected to return an object of that type. You could return null and then have a GetErrors method, but that feels messy to me. In a loosely-typed language, you could return either the object or the ValidationResult. That's less messy, but still leads to a lot of conditional logic to use it.

For context, I currently use approach #2, but my applications are small, one-person projects. I've been programming for a while now, but my professional life has been in QA for some time now and I'm trying to refresh myself now by working as cleanly as possible.

解決

You hit on good points but you sometimes miss a second option. The gist of my response to your question is that it can be done, it just requires effort to implement exactly what you want.

In your controller or other handler

You didn't explicitly claim otherwise, but I do want to point out here that different kinds of validation exist and belong in different locations.

For example, a controller should validate parseability (e.g. can "2020-06-12" be parsed back to a valid date?) whereas your business layer should validate the business needs (e.g. is 2020-06-12 in the allowed period for this user?)

Pass the data to the object constructor and let it perform any necessary integrity checks, throwing an exception if any issues are found.

The main issue I have with this is the lack of detail returned to the caller explaining what happened.

As much as "flow by exception" should generally be avoided, exceptions certainly don't lack detail. If you take this route, then your exception should be some kind of validation exception type which extends the exception class with all of the information you need to know about the validation failure.

The issue with this one is that you can't guarantee that the Validator will always be used.

You can, but it requires more effort. You could create a result class (e.g. ValidatedResult<T>) which effectively wraps a single value (i.e. the T). If you make sure that only the validator can instantiate this class (using nested classes or access modifiers), then you can guarantee that any ValidatedResult<T> object has been processed by a validator.

This makes sense in cases where each T has one type of validation, because otherwise you still can't be sure if your T was validated using the specific validation you're expecting it to be.

To further solve that issue of having multiple kinds of validation for a type, you can start extending these result types to explicitly specify which validation they belong to (e.g. ContainsNoProfanityValidationResult : ValidationResult<string>).

As you can see, this starts requiring more and more effort to implement, but it does give you stricter control and more solid guarantees, which you're specifically looking for.

However, I somewhat disagree about the necessity of doing it so strictly though. There's a difference between guarding against malevolent attacks and guarding against developer forgetfulness. I presume only the latter is really applicable here, and this should generally be caught with unit tests as validation failures lead to changes in public behavior (i.e. refusing to perform a requested action), which tests can and should catch.

In a loosely-typed language, you could return either the object or the ValidationResult. That's less messy

I disagree. If you look at the big picture, loose typing is messier than strong typing. However, strict typing requires a bit more pedantry to satisfy the compiler. That's not messy, it just takes a bit of effort - but it pays back dividends in any sufficiently large codebase.

I would say that any codebase in which you're worried about forgetting to use a validation (enough so that you want a preventative architecture) is more than big enough for strong typing to pay those dividends in the long run.

you could return either the object or the ValidationResult.

This circles back to my earlier point, "validation result" should include both the successes and the failures! Always return a validation result, and then inspect it to see if it contains a success.

Semantics matter here. Boiled down to its core essence, validation does not need to give you back the value you put in (since you already knew it), it just needs to tell you if it passes the validation or not. For a basic validation algorithm, there's no need to return the object you already passed into it.

However, if you do take the time and effort to encapsulate your values in a validation result (presumably with an additional boolean to confirm that it's indeed a success), then you can both:

Rest easy knowing that this value was already successfully validated and certified, allowing your domain to essentially require parameter values that were already validated
Reusably log validation failures and report the actual value that was being used.

Given the concerns that you listed in the question, using a validation result is a win-win here.

他のヒント

You validate to preserve invariants. Invariants exist for different reasons. Validation should live close to the reason for the invariant.

For example, if an object has integer division in it's behavior then zero is usually excluded from the divisor. Validation code to protect from that should live in the same class as the code that makes it necessary.

However, if such invariants do not emerge from the objects behavior, but rather emerge from a chosen use of the object then it's more appropriate to validate in the particular factory that supports this use.

You are not mentioning which language you are using, so this answer may be more or less useful depending on what language you are actually using. But even in languages that does not have support for this, you can add it yourself by using a library or implementing a "poor mans" version.

I believe that the best option is actually to use #3 (factory method) returning a result that is a discriminated union. This gives you the best of both worlds. If you havent heard of discriminated unions before, here is an explanation of that concept from F#: https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/discriminated-unions

This means that you could have a usage pattern like this

type CreationFault = {
  // whatever information you want about faults
}

type CreationResult<T> = 
  | Created of T
  | Failed of CreationFault list

let createWithFactory (fruitName : string) : CreationResult<Fruit> =
  // return a creationresult either containing a fruit or a list of errors
  // depending on if it works or not

let maybeAFruit = createWithFactory "banana"

// Now maybeAFruit is of the type CreationResult<Fruit>, so you cant actually touch
// the fruit unless you check which union case you got first

match maybeAFruit with
| Created(fruit) -> // Do something with your fruit
| Failed(errors) -> // Do something with your errors

This gives you the best of both worlds and leverages the type system so that you cant accidentially use an unvalidated object or "miss" checking the validation results. Similar patterns can be done even if the language doesnt have native support for Discriminated unions. C# does not have this construct in the language but you can for example use the OneOf library to achieve the same thing (examples in the github readme) (https://github.com/mcintyre321/OneOf).

ライセンス： CC-BY-SA と帰属

所属していません softwareengineering.stackexchange