Error handling considerations

https://softwareengineering.stackexchange.com/questions/352702

16-01-2021

Question

The problem:

For a long time, I have been uneasy about the exception mechanism, because I feel it does not really solve the problem it is meant to solve.

CLAIM: There are long debates elsewhere about this topic, and most of them get stuck comparing exceptions with returning an error code. That is definitely not the topic here.

Trying to define an error, I would agree with CppCoreGuidelines, from Bjarne Stroustrup & Herb Sutter

An error means that the function cannot achieve its advertised purpose

CLAIM: The exception mechanism is a language semantic for handling errors.

CLAIM: To me, there is "no excuse" for a function not achieving its task: either we defined the pre/post conditions wrongly, so the function cannot ensure results, or some specific exceptional case is not considered important enough to spend time developing a solution. Considering that, IMO, the line between normal code and error-handling code is (before implementation) a very subjective one.

CLAIM: Using exceptions to indicate that a pre- or postcondition is not kept is another purpose of the exception mechanism, mainly for debugging. I do not target this usage of exceptions here.

Many books, tutorials and other sources tend to present error handling as a quite objective science that is solved with exceptions: you just need to catch them to have robust software, able to recover from any situation. But my several years as a developer make me see the problem differently:

Programmers tend to simplify their task by throwing exceptions when a specific case seems too rare to be implemented carefully. Typical cases are out-of-memory, disk-full and corrupted-file issues. This might be sufficient, but it is not always decided at an architectural level.
Programmers tend not to read the documentation about exceptions in libraries carefully, and are usually not aware of which functions throw, and when. Furthermore, even when they know, they don't really handle them.
Programmers tend not to catch exceptions early enough, and when they do, it is mostly to log and rethrow (see the first point).

This has two consequences:

Errors that happen frequently are detected early in development and debugged (which is good).
Rare exceptions are not handled and make the system crash (with a nice log message) at the user's site. Sometimes the error is reported, sometimes not even that.

Considering that, IMO the main purposes of an error mechanism should be:

Make visible in the code where some specific case is not handled.
Communicate the issue at runtime to the related code (at least the caller) when this situation happens.
Provide recovery mechanisms.

The main flaw of exceptions as an error handling mechanism is, IMO, this: it is easy to see where a throw is in the source code, but not at all evident whether a specific function could throw by looking at its declaration. This brings all the problems that I described above.

The language does not enforce and check the error-handling code as strictly as it does other aspects of the language (e.g. strong typing of variables).

An attempted solution

With the intention of improving this, I developed a very simple error handling system, which tries to give error handling the same level of importance as the normal code.

The idea is:

Each (relevant) function receives a reference to a very light Success object, and may set it to an error status when needed. The object stays very light until an error with text is saved.
A function is encouraged to skip its task if the object provided already contains an error.
An error must never be overridden.

The full design obviously considers each aspect thoroughly (about 10 pages), including how to apply it to OOP.

Example of the Success class:

#include <string>

class Success
{
public:
    enum SuccessStatus
    {
        ok = 0,             // All is fine
        error = 1,          // Any error has been reached
        uninitialized = 2,  // Initialization is required
        finished = 3,       // This object already performed its task and is not useful anymore
        unimplemented = 4,  // This feature is not implemented already
    };

    Success(){}
    Success( const Success& v);
    virtual ~Success() = default;
    virtual Success& operator= (const Success& v);

    // Comparators
    virtual bool operator==( const Success& s)const { return (this->status==s.status && this->stateStr==s.stateStr);}
    // Unequal if either the status or the message differs
    virtual bool operator!=( const Success& s)const { return (this->status!=s.status || this->stateStr!=s.stateStr);}

    // Retrieve if the status is not "ok"
    virtual bool operator!() const { return status!=ok;}

    // Retrieve if the status is "ok"
    operator bool() const { return status==ok;}

    // Set a new status
    virtual Success& set( SuccessStatus status, std::string msg="");
    virtual void reset();

    virtual std::string toString() const{ return stateStr;}
    virtual SuccessStatus getStatus() const { return status; }
    virtual operator SuccessStatus() const { return status; }

private:
    std::string stateStr;
    SuccessStatus status = Success::ok;
};
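
The class above declares set and reset without showing their bodies. As a minimal sketch of the rule "an error must never be overridden", here is one way those two members could behave. MiniSuccess and firstErrorWins are hypothetical names used only for this illustration; this is not the original implementation.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the Success class above.
class MiniSuccess {
public:
    enum Status { ok = 0, error = 1 };

    // Record a status. The first error wins: later calls cannot overwrite it.
    MiniSuccess& set(Status newStatus, std::string msg = "") {
        if (status_ == ok && newStatus != ok) {
            status_ = newStatus;
            message_ = std::move(msg);
        }
        return *this;
    }

    // Explicitly clear the stored error so the object can be reused.
    void reset() {
        status_ = ok;
        message_.clear();
    }

    explicit operator bool() const { return status_ == ok; }
    const std::string& toString() const { return message_; }

private:
    Status status_ = ok;
    std::string message_;
};

// Usage: the second error is ignored because one is already stored.
inline std::string firstErrorWins() {
    MiniSuccess s;
    s.set(MiniSuccess::error, "disk full");
    s.set(MiniSuccess::error, "file corrupted");
    assert(!s);
    return s.toString();
}
```

With this behaviour the message that survives is the one closest to the root cause, which is usually the one worth logging.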

Usage:

double mySqrt( Success& s, double v)
{
    double result = 0.0;
    if (!s) ; // a previous error is pending: do nothing
    else if (v<0.0) s.set(Success::error, "Square root requires non-negative input.");
    else result = std::sqrt(v);
    return result;
}

Success s;
mySqrt(s, 144.0);
otherStuff(s);
saveStuff(s);
if (s) /*All is good*/;
else std::cout << s.toString() << std::endl;

I have used this in much of my (own) code, and it forces the programmer (me) to think further about possible exceptional cases and how to solve them (good). However, it has a learning curve and doesn't integrate well with code that does not use it.

The question

I would like to understand better the implications of using such a paradigm in a project:

Is the premise of the problem correct, or did I miss something relevant?
Is the solution a good architectural idea, or is the price too high?

EDIT:

Comparison between methods:

//Exceptions:

    // Incorrect
    File f = open("text.txt"); // Could throw, but nothing says so! Will crash
    save(f);

    // Correct
    File f;
    try
    {
        f = open("text.txt");
        save(f);
    }
    catch( ... )
    {
        // do something 
    }

//Error code (mixed):

    // Incorrect
    File f = open("text.txt"); // Nothing tells you it may fail! Will crash
    save(f);

    // Correct
    File f = open("text.txt");
    if (f) save(f);

//Error code (pure):

    // Incorrect
    File f;
    open(f, "text.txt"); // Easy to forget the return value! Will crash
    save(f);

    //Correct
    File f;
    Error er = open(f, "text.txt");
    if (!er) save(f);

//Success mechanism:

    Success s;
    File f;
    open(s, "text.txt");
    save(s, f); //s cannot be avoided, will never crash.
    if (s) ... //optional. If you created s, you probably don't forget it.

Solution

Error-handling is perhaps the hardest portion of a program.

In general, realizing that there is an error condition is easy; however signalling it in a way that cannot be circumvented and handling it appropriately (see Abrahams' Exception Safety levels) is really hard.

In C, signalling errors is done by a return code, which is isomorphic to your solution.

C++ introduced exceptions because of the shortcoming of such an approach; namely, it only works if callers remember to check whether an error occurred or not, and falls apart horribly otherwise. Whenever you find yourself saying "It's OK as long as every time..." you have a problem; humans are not that meticulous, even when they care.
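
To make the "callers must remember to check" problem concrete, here is a small sketch of a C-style return code in C++17. parseDigit and sumDigits are hypothetical functions invented for this example. The [[nodiscard]] attribute only asks the compiler to warn when the code is dropped; it does not force the caller to handle it.

```cpp
#include <cassert>

// Hypothetical C-style API: returns 0 on success, non-zero on failure.
// [[nodiscard]] (C++17) asks the compiler to warn when the result is dropped.
[[nodiscard]] int parseDigit(char c, int& out) {
    if (c < '0' || c > '9') {
        return 1;  // failure: out is left untouched
    }
    out = c - '0';
    return 0;
}

// A caller that checks the code and passes the failure upward.
[[nodiscard]] int sumDigits(const char* text, int& total) {
    total = 0;
    for (const char* p = text; *p != '\0'; ++p) {
        int digit = 0;
        if (int rc = parseDigit(*p, digit); rc != 0) {
            return rc;  // early return: stop at the first error
        }
        total += digit;
    }
    return 0;
}
```

Every layer between the failure and the code that can react to it has to repeat the same check, which is exactly the boilerplate exceptions were meant to remove.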

The problem, however, is that exceptions have their own issues. Namely, invisible/hidden control flow. This was intended: hiding the error case so that the logic of the code is not obfuscated by the error handling boilerplate. It makes the "happy path" much clearer (and fast!), at the cost of making the error paths nigh inscrutable.

I find it interesting to look at how other languages approach the issue:

Java has checked exceptions (and unchecked ones),
Go uses error codes/panics,
Rust uses sum types/panics,
FP languages in general.

C++ used to have some form of checked exceptions; you may have noticed they have been deprecated and simplified toward a basic noexcept(<bool>) instead: either a function is declared to possibly throw, or it's declared never to. Checked exceptions are somewhat problematic in that they lack extensibility, which can cause awkward mappings/nesting, and convoluted exception hierarchies (one of the prime use cases of virtual inheritance is exceptions...).
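
As a small illustration of what noexcept does and does not tell you, the sketch below uses two hypothetical functions, parsePort and defaultPort. The noexcept operator can query the promise at compile time, but a declaration without noexcept says only "might throw", not what or when.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical functions, used only to illustrate the declarations.
int parsePort(const std::string& text);      // may throw: the signature gives no detail
int defaultPort() noexcept { return 8080; }  // promises never to throw

int parsePort(const std::string& text) {
    int port = std::stoi(text);  // throws std::invalid_argument or std::out_of_range
    if (port < 0 || port > 65535) {
        throw std::out_of_range("port out of range");
    }
    return port;
}

// The noexcept operator reports, at compile time, what the declaration promises.
static_assert(noexcept(defaultPort()), "defaultPort is declared noexcept");
static_assert(!noexcept(parsePort(std::string())), "parsePort may throw");
```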

In contrast, Go and Rust take the approach that:

errors should be signaled in band,
exceptions should be used for really exceptional situations.

The latter is rather evident in that (1) they name their exceptions panics and (2) there is no type hierarchy/complicated clause here. The language does not offer facilities to inspect the content of a "panic": no type hierarchy, no user-defined content, just an "oops, things went so wrong there's no possible recovery".

This effectively encourages users to use proper error handling, whilst still leaving an easy way to bail out in exceptional situations (such as: "wait, I haven't implemented that yet!").

Of course, the Go approach unfortunately is much like yours in that you can easily forget to check the error...

... the Rust approach however is mostly centered around two types:

Option, which is similar to std::optional,
Result, which is a two-possibility variant: Ok and Err.

This is much neater because there is no opportunity for accidentally using a result without having checked for success: if you do, the program panics.
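
For readers more familiar with C++, here is a rough analogue using std::optional (C++17). portFor, portOrDefault and portOrThrow are hypothetical names. The comparison is only approximate: value() throws an exception rather than panicking, and C++ also allows unchecked access with operator*.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical lookup: returns an empty optional when the key is unknown.
std::optional<int> portFor(const std::string& service) {
    if (service == "http") return 80;
    if (service == "https") return 443;
    return std::nullopt;
}

// Checked use: the caller decides what happens when there is no value.
int portOrDefault(const std::string& service, int fallback) {
    return portFor(service).value_or(fallback);
}

// Unchecked use: value() throws std::bad_optional_access on an empty optional,
// which is the closest standard analogue to a Rust unwrap() panic.
// (Dereferencing with * instead would be undefined behaviour.)
int portOrThrow(const std::string& service) {
    return portFor(service).value();
}
```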

FP languages form their error handling in constructs which can be split into three layers:

Functor
Applicative / Alternative
Monads / Alternative

Let's have a look at Haskell's Functor typeclass:

class Functor m where
  fmap :: (a -> b) -> m a -> m b

First of all, typeclasses are somewhat similar to interfaces, but not the same. Haskell's function signatures look a bit scary at first, so let's decipher them. The function fmap takes a function as its first parameter, somewhat similar to std::function<b(a)>. The next thing is an m a. You can imagine m as something like std::vector and m a as something like std::vector<a>. The difference is that m a does not say it has to be std::vector specifically; it could be a std::optional, too. By telling the language that we have an instance of the Functor typeclass for a specific type like std::vector or std::optional, we can use the function fmap for that type. The same must be done for the typeclasses Applicative, Alternative and Monad, which allow you to do stateful, possibly failing computations. The Alternative typeclass implements error-recovery abstractions: you can write something like a <|> b, meaning either term a or term b. If neither computation succeeds, it's still an error.
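
To give a feel for a <|> b in C++ terms, here is a minimal sketch for std::optional. The helper name alt is hypothetical. Unlike Haskell, C++ evaluates both arguments before the call, so this only mirrors the result, not the laziness.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper mirroring a <|> b for std::optional:
// keep the first result if it holds a value, otherwise fall back to the second.
template <typename T>
std::optional<T> alt(const std::optional<T>& a, const std::optional<T>& b) {
    return a.has_value() ? a : b;
}

// Usage: the first alternative fails, so the second one is taken.
inline int firstAvailable() {
    std::optional<int> primary;        // empty: the first computation failed
    std::optional<int> secondary = 2;  // the fallback succeeded
    return alt(primary, secondary).value();
}
```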

Let's have a look at Haskell's Maybe type.

data Maybe a
  = Nothing
  | Just a

This means, that where you expect a Maybe a, you get either Nothing or Just a. When looking at fmap from above, an implementation could look like

fmap f m = case m of
  Nothing -> Nothing
  Just a -> Just (f a)

The case ... of expression is called pattern matching and resembles what is known in the OOP world as the visitor pattern. Imagine the line case m of as m.apply(...), where the dots are the instantiation of a class implementing the dispatch functions. The lines below the case ... of expression are the respective dispatch functions, bringing the fields of the class directly into scope by name. In the Nothing branch we create Nothing, and in the Just a branch we name our only value a and create another Just ... with the transformation function f applied to a. Read it as: new Just(f(a)).

This can now handle erroneous computations while abstracting the actual error checks away. There exist implementations for the other interfaces, which makes this kind of computation very powerful. Actually, Maybe is the inspiration for Rust's Option type.

I would therefore encourage you to rework your Success class toward a Result instead. Alexandrescu actually proposed something really close, called expected<T>, for which standard proposals were made.

I will stick to the Rust naming and API simply because... it's documented and works. Of course, Rust has a nifty ? suffix operator which would make the code much sweeter; in C++, we'll use a TRY macro and GCC's statement expressions to emulate it.

template <typename E>
struct Error {
    Error(E e): error(std::move(e)) {}

    E error;
};

template <typename E>
Error<E> error(E e) { return Error<E>(std::move(e)); }

template <typename T, typename E>
struct [[nodiscard]] Result {
    template <typename U>
    Result(U u): ok(true), data(std::move(u)), error() {}

    template <typename F>
    Result(Error<F> f): ok(false), data(), error(std::move(f.error)) {}

    template <typename U, typename F>
    Result(Result<U, F> other):
        ok(other.ok), data(std::move(other.data)),  error(std::move(other.error)) {}

    bool ok = false;
    T data;
    E error;
};

#define TRY(Expr_) \
    ({ auto result = (Expr_); \
       if (!result.ok) { return result; } \
       std::move(result.data); })

Note: this Result is a placeholder. A proper implementation would use encapsulation and a union. It's sufficient to get the point across however.
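
As a rough idea of what "encapsulation and a union" could look like, here is a sketch that uses std::variant (C++17) as the tagged union and checked accessors. CheckedResult and checkedSqrt are hypothetical names, and throwing std::logic_error on misuse is just one possible policy (aborting would be closer to a Rust panic).

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>

// Hypothetical encapsulated variant of the placeholder Result above.
template <typename T, typename E>
class [[nodiscard]] CheckedResult {
    using Storage = std::variant<T, E>;  // index 0 = value, index 1 = error

public:
    static CheckedResult success(T value) {
        return CheckedResult(Storage(std::in_place_index<0>, std::move(value)));
    }
    static CheckedResult failure(E error) {
        return CheckedResult(Storage(std::in_place_index<1>, std::move(error)));
    }

    bool isOk() const { return storage_.index() == 0; }

    // Accessing the wrong alternative throws instead of reading garbage.
    const T& value() const {
        if (!isOk()) throw std::logic_error("value() called on an error result");
        return std::get<0>(storage_);
    }
    const E& error() const {
        if (isOk()) throw std::logic_error("error() called on a success result");
        return std::get<1>(storage_);
    }

private:
    explicit CheckedResult(Storage s) : storage_(std::move(s)) {}
    Storage storage_;
};

// Usage with a hypothetical checked square root.
inline CheckedResult<double, std::string> checkedSqrt(double x) {
    using R = CheckedResult<double, std::string>;
    if (x < 0) return R::failure("negative input");
    return R::success(std::sqrt(x));
}
```

Because the data members are private, the only way to reach the value is through value(), which is where the check lives.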

Which allows me to write:

Result<double, std::string> sqrt(double x) {
    if (x < 0) {
        return error("sqrt does not accept negative numbers");
    }
    return x;
}

Result<double, std::string> double_sqrt(double x) {
    auto y = TRY(sqrt(x));
    return sqrt(y);
}

which I find really neat:

unlike the use of error codes (or your Success class), forgetting to check for errors will result in a runtime error¹ rather than some random behavior,
unlike the use of exceptions, it is apparent at the call site which functions can fail, so there's no surprise,
with the C++2x standard, we may get concepts. This would make this kind of programming far more pleasant, as we could leave the choice of the error kind open. E.g. with an implementation of std::vector as the result, we could compute all possible solutions at once. Or we could choose to improve error handling, as you proposed.

¹ With a properly encapsulated Result implementation ;)

Note: unlike exceptions, this lightweight Result does not carry backtraces, which makes logging less useful; you may find it helpful to at least log the file/line number at which the error message is generated, and to generally write a rich error message. This can be complemented by capturing the file/line each time the TRY macro is used, essentially creating the backtrace manually, or by using platform-specific code and libraries such as libbacktrace to list the symbols in the call stack.
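
Recording the file and line where an error is created can be sketched as follows. LocatedError and MAKE_ERROR are hypothetical names; a macro is used so that __FILE__ and __LINE__ expand at the call site. In C++20, std::source_location can serve the same purpose without a macro.

```cpp
#include <cassert>
#include <string>

// Hypothetical error type that remembers where it was created.
struct LocatedError {
    std::string message;
    const char* file;
    int line;

    // Formats the error as "file:line: message" for logging.
    std::string describe() const {
        return std::string(file) + ":" + std::to_string(line) + ": " + message;
    }
};

// The macro is needed so that __FILE__ and __LINE__ refer to the call site.
#define MAKE_ERROR(msg) (LocatedError{(msg), __FILE__, __LINE__})

// Usage: the location of this return statement travels with the error.
inline LocatedError failToOpen() {
    return MAKE_ERROR("cannot open file");
}
```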

There is one big caveat though: existing C++ libraries, and even std, are based on exceptions. It'll be an uphill battle to use this style, as any 3rd party library's API must be wrapped in an adapter...
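
Such an adapter might look like the sketch below, which wraps std::stoi so that its exceptions become an in-band result. ParseOutcome and parseInt are hypothetical names, and the pair of fields stands in for a proper Result type.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical in-band result for the adapter below.
struct ParseOutcome {
    std::optional<int> value;  // set on success
    std::string error;         // set on failure
};

// Adapter: converts the exceptions documented for std::stoi into a
// ParseOutcome. Out-of-memory conditions are still reported by exception.
inline ParseOutcome parseInt(const std::string& text) {
    try {
        return ParseOutcome{std::stoi(text), ""};
    } catch (const std::invalid_argument&) {
        return ParseOutcome{std::nullopt, "not a number: " + text};
    } catch (const std::out_of_range&) {
        return ParseOutcome{std::nullopt, "out of range: " + text};
    }
}
```

One such wrapper is needed per library entry point, which is where the "uphill battle" comes from.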

OTHER TIPS

CLAIM: The exception mechanism is a language semantic for handling errors

Exceptions are a control-flow mechanism. The motivation for this control-flow mechanism was specifically to separate error handling from non-error handling code, in the common case that error handling is very repetitive and bears little relevance to the main part of the logic.

CLAIM: To me, there is "no excuse" for a function not achieving its task: either we defined the pre/post conditions wrongly, so the function cannot ensure results, or some specific exceptional case is not considered important enough to spend time developing a solution

Consider: I try to create a file. The storage device is full.

Now, this isn't a failure to define my preconditions: you can't use "there must be enough storage" as a precondition in general, because shared storage is subject to race conditions that make this impossible to satisfy.

So, should my program somehow free some space and then proceed successfully, otherwise I'm just too lazy to "develop a solution"? This seems frankly nonsensical. The "solution" to managing shared storage is outside the scope of my program, and allowing my program to fail gracefully, and be re-run once the user has either released some space, or added some more storage, is fine.

What your success class does is interleave error-handling very explicitly with your program logic. Every single function needs to check, before running, whether some error already occurred which means it shouldn't do anything. Every library function needs to be wrapped in another function, with one more argument (and hopefully perfect forwarding), which does exactly the same thing.

Note also that your mySqrt function needs to return a value even if it failed (or a prior function had failed). So, you're either returning a magic value (like NaN), or injecting an indeterminate value into your program and hoping nothing uses that without checking the success state you've threaded through your execution.

For correctness - and performance - it's much better to pass control back out of scope once you can't make any progress. Exceptions and C-style explicit error checking with early return both accomplish this.

For comparison, an example of your idea which really does work is the Error monad in Haskell. The advantage over your system is that you write the bulk of your logic normally, and then wrap it in the monad which takes care of halting evaluation when one step fails. This way the only code touching the error-handling system directly is the code that might fail (throw an error) and the code that needs to cope with the failure (catch an exception).

I'm not sure that monad style and lazy evaluation translate well to C++ though.
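
For what it is worth, the "halt when one step fails" part can be approximated in C++17 without lazy evaluation. The sketch below uses std::optional and a hypothetical andThen helper (C++23 adds a similar and_then member to std::optional); safeSqrt, safeReciprocal and reciprocalOfSqrt are made-up example functions.

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical bind ("and then") for std::optional: run the next step only
// if the previous one produced a value, otherwise propagate the failure.
template <typename A, typename F>
auto andThen(const std::optional<A>& m, F&& f) -> decltype(f(*m)) {
    if (!m) {
        return std::nullopt;
    }
    return f(*m);
}

inline std::optional<double> safeSqrt(double x) {
    if (x < 0) return std::nullopt;
    return std::sqrt(x);
}

inline std::optional<double> safeReciprocal(double x) {
    if (x == 0) return std::nullopt;
    return 1.0 / x;
}

// The happy path reads in order; neither step checks for a prior failure itself.
inline std::optional<double> reciprocalOfSqrt(double x) {
    return andThen(safeSqrt(x), safeReciprocal);
}
```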

I would like to understand better the implications of using such a paradigm in a project:

Is the premise of the problem correct, or did I miss something relevant?

Is the solution a good architectural idea, or is the price too high?

Your approach brings some large problems into your source code:

it relies on the client code always remembering to check the value of s. This is a common problem when return codes are used for error handling, and one of the reasons that exceptions were introduced into the language: with exceptions, if you fail, you do not fail silently.
the more code you write with this approach, the more error-handling boilerplate you will have to add as well (your code is no longer minimalistic), and your maintenance effort goes up.

But my several years as a developer make me see the problem differently:

The solutions for these problems should be approached at technical lead level or team level:

Programmers tend to simplify their task by throwing exceptions when the specific case seem too rare to be implemented carefully. Typical cases of this are: out of memory issues, disk full issues, corrupted file issues, etc. This might be sufficient, but is not always decided from an architectural level.

If you find yourself handling every type of exception that may be thrown, all the time, then the design is not good. Which errors get handled should be decided according to the specifications for the project, not according to what devs feel like implementing.

Address this by setting up automated testing, and by separating the specification of the unit tests from the implementation (have two different people do this).

Programmers tend not to read the documentation carefully [...] Furthermore, even when they know, they don't really handle them.

You will not address this by writing more code. I think your best bet is meticulously applied code reviews.

Programmers tend not to catch exceptions early enough, and when they do, it is mostly to log and rethrow (see the first point).

Proper error handling is hard, but less tedious with exceptions than with return values (whether they are actually returned or passed as i/o arguments).

The trickiest part of error handling is not how you receive the error, but how to make sure your application keeps a consistent state in the presence of errors.

To address this, more attention needs to be given to identifying error conditions and running the code under them (more testing, more unit/integration tests, etc).
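
One common way to keep state consistent when an operation fails halfway is to do the work on a copy and commit it only at the end. The sketch below illustrates this with a hypothetical NameList class; it is one technique among several, not a general recipe.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical container of validated names.
class NameList {
public:
    // Adds all names or none: work on a copy, then commit with a swap.
    void addAll(const std::vector<std::string>& names) {
        std::vector<std::string> updated = names_;  // may throw, names_ untouched
        for (const std::string& name : names) {
            if (name.empty()) {
                throw std::invalid_argument("empty name");  // names_ still untouched
            }
            updated.push_back(name);
        }
        names_.swap(updated);  // commit step: swapping vectors does not throw here
    }

    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};
```

A test for this kind of code deliberately triggers the failure in the middle of the batch and then checks that the object is unchanged.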

Licensed under: CC-BY-SA with attribution
