Question

I am writing a library for some data structures in C that will be used in embedded systems. I have had issues designing and coming up with a solid error handling plan. This API is only subject to logic errors which is why I am so conflicted. By this I mean the preconditions might be: "x != NULL" or "index < size of container".

After doing a lot of research on the forums here, it seems as though there are several approaches to error handling in C:

  1. Use errno. I believe the general sentiment is errno should be avoided at all costs, and its design philosophy is outdated, so this is a no-go.
  2. Use error codes. Every function could either return error codes, or the user could pass a pointer for the error codes to be assigned. I believe returning error codes is much more elegant than the latter, but some functions will feel clunky because the user will have to pass a pointer to get the "output" of the function. One of the functions is "get_index_of_object". You can see how this would be anything but convenient.
  3. Use asserts that will be disabled in production build. This is what I am leaning towards as of now, since these data structures will be used so often, the performance boost of having the asserts disabled in production build might be noticeable (although I have no data to support this as of now). As previously mentioned, this API can only be affected by logic errors (users not respecting the preconditions of the functions). To my understanding, asserting logic errors for debugging purposes is encouraged, but is it good practice to use asserts for validating function arguments in a public API? Especially when it will be used in embedded systems?
  4. Check the preconditions, and return early if they are violated. My biggest issue with this approach is that debugging might be a nightmare for someone years later. Some functions could return, say, "NULL", but others such as "get_index_of_object" can't return a value that describes the error.
  5. Use a combination of the above four error handling methods. I would like to avoid code bloat and keep things as convenient as possible, but this is an option.

I am curious as to what you all have to say.

Solution

Speaking as a developer for small embedded systems, I would take asserts over any of the other options.

If the space for your code is limited, then you can easily get into a situation where every byte counts. In that situation, asserts (especially the assert macro) have the advantage that you can disable the sanity checks so completely that they don't contribute to the code size.
Of course, that should only be done when you are confident that the calling code doesn't break the contracts of the API, but that can often be checked in a more space-lenient environment.
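
As a rough illustration of that trade-off, here is a minimal sketch of a container whose preconditions are checked only with the standard assert macro. The names vec_t, vec_init, vec_push, vec_get and VEC_CAPACITY are hypothetical and not part of the library in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity vector; all names are illustrative only. */
#define VEC_CAPACITY 8u

typedef struct {
    int    items[VEC_CAPACITY];
    size_t size;
} vec_t;

/* Precondition: v != NULL */
void vec_init(vec_t *v)
{
    assert(v != NULL);
    v->size = 0;
}

/* Preconditions: v != NULL, v->size < VEC_CAPACITY */
void vec_push(vec_t *v, int value)
{
    assert(v != NULL);
    assert(v->size < VEC_CAPACITY);
    v->items[v->size++] = value;
}

/* Preconditions: v != NULL, index < v->size */
int vec_get(const vec_t *v, size_t index)
{
    assert(v != NULL);
    assert(index < v->size);
    return v->items[index];
}
```

When the translation unit is compiled with NDEBUG defined (for example by passing -DNDEBUG to the compiler), each assert expands to a no-op, so the checks cost neither code space nor cycles. The price is that a violated precondition is then undefined behavior rather than a reported failure.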

In most embedded development environments, the assert macro has the added benefit of immediately transferring you to a breakpoint in the debugger when the assertion fails, so you can immediately inspect why the assertion failed.
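
If your toolchain's assert does not behave that way, a library can offer something similar through its own check macro. The sketch below is only one possible shape: LIB_ASSERT, LIB_NO_ASSERT, lib_set_assert_hook, lib_assert_failed and lib_record_failure are all made-up names, and the hook is where a target build might place a breakpoint instruction or a reset.

```c
#include <stddef.h>

/* Hypothetical hook signature: receives the location and text of the failed check. */
typedef void (*lib_assert_hook_t)(const char *file, int line, const char *expr);

static lib_assert_hook_t lib_assert_hook = NULL;

/* Install the function to call when a check fails (NULL restores the default). */
void lib_set_assert_hook(lib_assert_hook_t hook)
{
    lib_assert_hook = hook;
}

/* Called by LIB_ASSERT when its expression evaluates to false. */
void lib_assert_failed(const char *file, int line, const char *expr)
{
    if (lib_assert_hook != NULL) {
        lib_assert_hook(file, line, expr);
        return;
    }
    /* Default: spin here so an attached debugger stops on the failing call stack. */
    for (;;) {
    }
}

/* Defining LIB_NO_ASSERT at build time compiles every check away. */
#ifdef LIB_NO_ASSERT
#define LIB_ASSERT(e) ((void)0)
#else
#define LIB_ASSERT(e) ((e) ? (void)0 : lib_assert_failed(__FILE__, __LINE__, #e))
#endif

/* Example host-side hook that only records the failure, e.g. for unit tests. */
static int         lib_failure_count = 0;
static const char *lib_last_failed_expr = NULL;

static void lib_record_failure(const char *file, int line, const char *expr)
{
    (void)file;
    (void)line;
    lib_failure_count++;
    lib_last_failed_expr = expr;
}
```

Note that if the installed hook returns, execution continues past the failed check. That is convenient for host-side tests, but on a target you would probably want the hook to halt or reset instead.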

The general guidance against using asserts in public APIs is geared more towards situations where unsanitized user input can reach the API (for example, a web API) or where the implementation of the API isn't available to the programmer calling it (for example, a closed-source, binary-only library).
In the embedded world, even proprietary libraries generally come with the source code available because the variability in processors and tools makes it prohibitive to distribute binaries for all of them.

OTHER TIPS

This is a common misconception:

To my understanding, asserting logic errors for debugging purposes is encouraged,...

In spite of the fact that a lot of developers use them for that purpose, assertions are absolutely not meant for logical error checking. If an assertion fires, it is meant to indicate a problem with the programmer, not the program. Assertions are used to validate the programmer's assumptions about the structure and design of the program.

For example, let's say you have a public API:

static int getValueImpl(void *p); /* internal implementation, defined below */

int get_value(void *p) {
    if (p) {
        return getValueImpl(p);
    }
    return APP_ERROR_NULL_POINTER;
}

with the associated internal implementation:

static int getValueImpl(void *p) {
    assert(p); /* this should have been checked already */
    return 3;
}

The assertion is used to validate the programmer's assumption that all inputs to public APIs are sanitized before being passed to internal code. That assertion should NEVER, EVER FIRE. If it does, it means that someone wrote a public API that didn't sanitize user input. This is a structural problem, not a logic error.

In your particular case, asserts won't really work either, although I'm sure their small footprint is appealing in an embedded environment.

3. Use asserts that will be disabled in production build.

Logic errors happen at runtime. If your library has no logical error checking in place other than assertions that are omitted in production code, what happens if the client violates an API contract then? Is there any way for the client to know that they have violated the contract at runtime?

Let's say you release your library as a linkable binary with assertions enabled. If the client code violates an API contract then, the assertion mechanism will crash the client's process. Is that the intended behavior?

The thing that's easy to misconstrue about designing by contract is the fact that code contracts are only enforceable within the same codebase. You can't force client code to abide by a code contract in a public API. All you can do is publish the contract, and respond in a well-defined manner to any violations of that contract. The most common way to do that in C is to return an error code that client code is required (by contract) to check. If they don't check it, then that's their problem, not yours.

As to APIs that "can't" return an error code because they already return an integer value, the commonly accepted and understood way to do that in C is to return the integer value through a pointer parameter, and the error code in the function's return value. It's not a question of "convenience" from an experienced C developer's point of view; it's simply how it is normally done.
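
A minimal sketch of that convention, applied to the "get_index_of_object" function from the question, might look like this. The signature, the ds_status_t enum and its values are assumptions made for illustration; the question does not show the real API.

```c
#include <stddef.h>

/* Hypothetical status codes for the library. */
typedef enum {
    DS_OK = 0,
    DS_ERR_NULL_POINTER,
    DS_ERR_NOT_FOUND
} ds_status_t;

/*
 * Searches items[0..count-1] for object.
 * On success, stores the index of the first match in *index_out and returns DS_OK.
 * On failure, leaves *index_out untouched and returns an error code.
 */
ds_status_t get_index_of_object(const int *items, size_t count,
                                int object, size_t *index_out)
{
    if (items == NULL || index_out == NULL) {
        return DS_ERR_NULL_POINTER;
    }
    for (size_t i = 0; i < count; ++i) {
        if (items[i] == object) {
            *index_out = i;
            return DS_OK;
        }
    }
    return DS_ERR_NOT_FOUND;
}
```

The caller declares a size_t, passes its address, and checks the returned status before using the index. Leaving the output untouched on failure is a design choice here, not a requirement of the pattern.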

I favor using asserts, and leaving them on for production code. Their cost is more than outweighed by the benefits they provide.

The assert macro is a role model of simplicity and elegance. It essentially boils down to:

#define assert(e) ((e) ? (void)0 : abort())

except that it will print an error message to stderr if the assertion fails. The assertion

assert( size <= LIMIT );

will abort the program and print an error message that looks something like this (the exact format is implementation-defined):

Assertion violation: file tripe.c, line 34: size <= LIMIT

Which is just about as good an outcome as you could expect in a programming language that doesn't have exceptions. No additional handling effort on the part of the caller (such as checking return codes) is required.

This article explains how to use asserts to provide preconditions and postconditions for your C functions, essentially implementing code contracts (no small feat, since C has no built-in error handling).
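
To give a flavor of what that looks like, here is a small sketch that is not taken from the article. The function find_max_index is a hypothetical example; it states its contract as asserts at entry and exit.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the index of the largest element (the first one, if there are ties). */
size_t find_max_index(const int *items, size_t count)
{
    /* Preconditions: the caller must pass a non-empty array. */
    assert(items != NULL);
    assert(count > 0);

    size_t best = 0;
    for (size_t i = 1; i < count; ++i) {
        if (items[i] > items[best]) {
            best = i;
        }
    }

    /* Postcondition: the result is a valid index into the array. */
    assert(best < count);
    return best;
}
```

The precondition asserts catch a caller's mistake, while the postcondition assert catches a mistake in the function itself.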

Never use asserts to check public API preconditions, especially if you develop a library.

You don't control how other people call your library. Every crash in your library is a bug in that library. A library shouldn't crash just because it received bad input; it should return an error instead. Think about fuzz testing: it generates invalid input to check that your software can recover from it, which is the normal behavior.

Let me take an example: I once had a crash in Microsoft Word when opening a specific document. OpenOffice at that time could open it and helped me recover my data. Even if Word had shown me an assertion telling me the file was badly formed, which program do you think did a better job? This is not specific to the desktop world. I worked on embedded systems (set-top boxes) where you had multimedia files to process and display: you are not allowed to crash because a media file is corrupted; at worst you just trigger an error and don't play the media.

You think asserting helps you get to the place where the error happened? Use logging and you'll get the same effect, but without forcing a crash. An assertion doesn't make a distinction between recoverable and non-recoverable errors: you always stop. Logging, on the other hand, tells you where and why it failed and lets you decide whether you can continue or not.

Think of a customer who wants their app running, but each time a certain condition happens, it crashes. I think the customer would prefer to have some log or error show up and the rest of the application carry on if the error is not a fatal one. You don't want any crashers in production code, and an assert is just that: an intentional crasher. If assertions are disabled in production then it gets worse: your whole error-handling logic has disappeared, and instead of crashing with an assertion message you're now crashing (maybe much, much) later with no message, just because you didn't detect bad input and continued to process it.

If you're developing your own application, you might want to use assert to check for your own bugs in your internal calls, but for a library it's a no-no.

As for your options:

  1. errno: I agree that errno doesn't fit current programming practices, especially with multithreading
  2. Return code, or return code + pointer to error, is good. That is what GLib does. The pointer to error is optional but can be used to get a relevant message and pass the error up to the layers above. In the case of your "get_index_of_object", that isn't even necessary: an index is never negative, so just returning -1 is enough to say your object was not found.
  3. Disabling asserts in production: this breaks your logging logic, as people won't do logging + assert; they'll think the assert is enough. Again, you need logging in production and you need to avoid crashes at all costs, so you can't rely on assertions.
  4. Returning early is what GLib does. Look at g_return_if_fail or g_return_val_if_fail. You want to have a trace of runtime errors: again, use logging, with different levels (debug, warning, info, error). And if that error is considered fatal and can't be recovered from, then yes, stop.
  5. Be pragmatic. Sometimes the error code is enough, sometimes not, and you'll need a location for a more specific error code and error message. Sometimes you use functions that report errors through errno and will have to cope with that internally, but you can wrap the errors before returning them to the layer above.
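
To make options 2 and 4 a bit more concrete, here is a homemade sketch loosely modeled on the idea behind g_return_val_if_fail. It is not GLib's implementation: DS_RETURN_VAL_IF_FAIL and ds_index_of are made-up names, and logging through fprintf to stderr is an assumption that may not hold on a small target, where you would substitute your own log routine.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical check macro: log the failed condition, then return val. */
#define DS_RETURN_VAL_IF_FAIL(cond, val)                      \
    do {                                                      \
        if (!(cond)) {                                        \
            fprintf(stderr, "%s:%d: check '%s' failed\n",     \
                    __FILE__, __LINE__, #cond);               \
            return (val);                                     \
        }                                                     \
    } while (0)

/*
 * Returns the index of the first element equal to object,
 * or -1 if it is not found or if items is NULL.
 */
long ds_index_of(const int *items, size_t count, int object)
{
    DS_RETURN_VAL_IF_FAIL(items != NULL, -1);

    for (size_t i = 0; i < count; ++i) {
        if (items[i] == object) {
            return (long)i;
        }
    }
    return -1; /* not found */
}
```

With this shape, -1 covers both "not found" and "bad argument". The log line is what tells the two apart, which is why the check and the logging go together.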

All in all, I recommend taking a look at the GLib and GTK+ libraries. I learned a lot from reading their source code.

I've been wondering about this again lately. I recently discovered that the function "fopen" asserts if you pass a null pointer as the file path, at least in the C runtime I was using. That's a standard C library function -- and it's using assert. You could argue that input to a software library is different from input obtained from outside the program (as in the corrupt document example given). Data from an external source should always be validated at run time. OpenOffice did the right thing in that example. But invalid input to a library API is the fault of some programmer, whether that be you or someone else.

A software library is intended for use by a programmer, and if that programmer tries to pass in unacceptable input, that's a programming error. In a language like C, without exceptions, aborting is probably the best way to make it obvious to the programmer that they messed up, as opposed to returning an error code that might not get checked. Imagine if "fopen" failed more quietly by simply returning a null pointer, which is what I assumed it would do. I could have been left scratching my head as to why it wasn't able to open a file, which would have led to time spent debugging. The assertion failure drew my attention immediately to the mistake in my code. So I can appreciate how assertions can potentially save many hours of debugging effort.

I'm beginning to think it's acceptable to use them for the public APIs of software libraries. They should of course never be used behind a public API where the input is from an external source, such as a web API, as someone else has already pointed out. That would make it too easy for a hacker to crash your program simply by sending some data that it can't handle.

Licensed under: CC-BY-SA with attribution