Question

The focus of this question: some software performs "extra work" in order to increase the chance of an eventually successful or satisfactory outcome despite one or more internal errors in the software, which requires a longer execution time when those errors occur. All of this happens without the user's knowledge if the outcome is successful.

Definition of complex software:

  • Contains code contributed by more than 10 developers over its lifetime, not all written in the same time frame
  • Depends on more than 10 external libraries, each with caveats
  • A typical software task (generating a result wanted by the user) requires 10 or more input parameters, most of which have default values but are configurable if the user needs control.
  • Most importantly, software that has the appropriate complexity relative to the task being performed, i.e. not unnecessarily complicated.

Edited: What is complex? Please see "There is a big difference between Complex and Complicated" (direct link).

Definition of Redundancy/Robustness within this question:
(Added Robustness based on comments)

  • If a software task failed when the current set of parameters was used, try different parameters.
    • Obviously, there must be inside knowledge that those "different" parameters use a different code path, possibly resulting in a different (hopefully better) outcome.
    • Sometimes these different code paths are chosen based on observations of the external libraries.
  • At the end, if the actual task performed is slightly different from the user's specification, the user will receive a report detailing the discrepancy.
  • Finally, like the 10-plus configurable parameters, the redundancy and reporting are themselves configurable (a sketch of this retry-and-report pattern follows this list).
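
To make that definition concrete, here is a minimal sketch of the retry-and-report pattern in Java. Everything in it (the ResilientRunner class, the Task interface, the parameter maps) is a hypothetical placeholder for illustration, not part of any particular software mentioned in this question:

  import java.util.List;
  import java.util.Map;

  // Hypothetical sketch: try alternative parameter sets when the task fails,
  // then report any discrepancy from the user's original specification.
  class ResilientRunner {
      interface Task {
          String run(Map<String, String> params) throws Exception;
      }

      String runWithFallbacks(Task task, List<Map<String, String>> parameterSets) {
          for (Map<String, String> params : parameterSets) {
              try {
                  String result = task.run(params);
                  if (!params.equals(parameterSets.get(0))) {
                      // Succeeded, but not with the user's original parameters:
                      // report exactly what was changed.
                      System.out.println("Note: task completed with fallback parameters " + params);
                  }
                  return result;
              } catch (Exception e) {
                  // Each parameter set is assumed to exercise a different code path;
                  // fall through and try the next one.
              }
          }
          throw new IllegalStateException("All parameter sets failed");
      }
  }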

Example of such software:

  • Database Migration
    • Business database
    • Source control database, etc.
  • Batch converting between a Word document and an OpenOffice document, PowerPoint and OpenOffice Draw, etc.
  • Automatic translation of an entire website
  • Automatic analysis of a software package, as with Doxygen, but where the analysis needs to be more dependable (i.e. not just a documentation tool)
  • Network communication, where packets may be lost and a number of retries are expected

This question was originally inspired by How do you deal with intentionally bad code?
but is now focused on just one of the causes of software bloat. It does not address any other causes of software bloat, such as the addition of new features.


Solution

This is a business question, not a technical one.

Sometimes I'm coding with researchers or on a prototype, so we'll build something with very low robustness. If it breaks, we fix it. There's no point in investing in extra magic if we're going to throw away the code soon.

But if the users of your system need it to be robust, you should build it that way. And you should make it robust specifically in the ways that you and they need to maximize long-term success, while ignoring the kinds of redundancy/robustness that they don't need.

Generally, I start rough and then add robustness over time. I frequently make questions like this part of the normal planning process. I generally work in the Extreme Programming style, where we make a long list of desired features, and I put robustness features in there, too. E.g., "System survives the failure of any single box," gets mixed in with things like "User can join using Facebook credentials." Whichever one comes up first, I build first.

Other tips

Complex software is typically redundant, as you probably know, but obviously not because that's the best way to do it; rather, developers tend to "tack on" to existing code instead of attempting to understand in great detail how the software works.

However, if you asked me how much redundancy should be acceptable, I would say none whatsoever. Redundancy is one of many side effects of complexity, the arch-nemesis of simplicity. Arguably, simplicity should take a back seat if time is of greater importance, though I stress that those who argue that time is of the essence are rarely the ones who actually take care in developing software. It's usually your project manager egging you on to just get the job done as soon as possible so you can get back to more pressing issues; however, it's your duty as a programmer to know when the job is done. I think the job isn't done until you've successfully integrated your change into the program the way it was meant to be. Perhaps the program is complicated, but at least your modification fits the pattern, making it somewhat easy to understand for a programmer used to seeing similar code.

However, it should be said that in doing so, you may have to produce redundant code. If the project is already highly redundant, it may actually be simpler to continue the pattern, assuming of course your boss doesn't have a couple of weeks to kill so you can restructure the entire project.

Edit: In light of the rephrasing of the question, I'll add a bit about robustness. In my opinion, parameter checking should be done only if (A) you accept a very specific format, such as a date value as a string, or (B) various parameters might conflict with each other or are mutually exclusive.

With (A), the requirement that parameters match a specific format is usually critical to the purpose of the method (such as a conversion from a string to a date). Technically it could still happen in your program without being a necessity, but I would strongly encourage you to eliminate these possibilities and accept only parameters that you know represent the type of data you're looking for: accept a date rather than a string if the point isn't to convert, for example, and if conversion must also be done, use a utility method to convert the string before passing it to the method.
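
As an illustration only (the Billing class and its methods are made up for this answer), the method accepts the type it actually needs, and the format check lives in exactly one utility method at the boundary:

  import java.time.LocalDate;
  import java.time.temporal.ChronoUnit;

  class Billing {
      // The method accepts the type it actually needs, so there is no format checking here.
      static long daysUntilDue(LocalDate dueDate) {
          return ChronoUnit.DAYS.between(LocalDate.now(), dueDate);
      }

      // Parsing (and its error handling) lives in one utility method.
      static LocalDate parseDate(String text) {
          return LocalDate.parse(text);   // ISO format, e.g. "2024-06-30"; throws if malformed
      }
  }

  // Usage: convert once at the boundary, then pass typed data around.
  // long days = Billing.daysUntilDue(Billing.parseDate("2024-06-30"));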

As for (B), mutually exclusive parameters represent a bad structure. There ought to be two classes, one which handles one case and another which handles the other. All common operations can be done in a single base class to avoid redundancy.
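
A minimal sketch of that structure (class names invented for this answer): the mutually exclusive behaviors become two subclasses, and the common work sits in the base class.

  // Hypothetical sketch: common operations in the base class, the two
  // mutually exclusive behaviors split into separate subclasses.
  abstract class Exporter {
      final void export(String data) {
          String prepared = prepare(data);   // common operation, done once here
          write(prepared);                   // case-specific behavior
      }

      private String prepare(String data) {
          return data.trim();
      }

      protected abstract void write(String prepared);
  }

  class FileExporter extends Exporter {
      @Override protected void write(String prepared) {
          System.out.println("writing to file: " + prepared);
      }
  }

  class NetworkExporter extends Exporter {
      @Override protected void write(String prepared) {
          System.out.println("sending over the network: " + prepared);
      }
  }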

In situations in which the number of parameters to a method gets to be 10+, I begin to consider a property file that contains all these parameters, which are most likely not going to change often. If they do change, you can keep the default in the property file and add a "setPropertyName()" method which lets you override the default at runtime.
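
For example, a sketch using java.util.Properties (the key name, file path, and default value are assumptions, not from the question): the default lives in the file and a setter overrides it at runtime.

  import java.io.FileInputStream;
  import java.io.IOException;
  import java.util.Properties;

  class ConversionConfig {
      private final Properties props = new Properties();

      ConversionConfig(String path) throws IOException {
          try (FileInputStream in = new FileInputStream(path)) {
              props.load(in);   // the 10+ rarely-changing parameters live in this file
          }
      }

      int getMaxRetries() {
          return Integer.parseInt(props.getProperty("maxRetries", "3"));   // built-in default
      }

      // Runtime override of the default, as described above.
      void setMaxRetries(int value) {
          props.setProperty("maxRetries", Integer.toString(value));
      }
  }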

Software should be forgiving of user mistakes, and completely intolerant of programmer mistakes.

Meaning, software should be very robust and allow smooth recovery from things like user input errors and system configuration errors. At the very least, show a friendly error message stating where the error occurred (input box, config file, command-line argument, etc.) and what constraint was violated ("must be less than X characters", "valid options are [X, Y, Z]", etc.). For additional robustness, the software can suggest an alternative or use a reasonable default (but it should always indicate that it is not using exactly what the user specified).
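
A small sketch of that behavior (the option name and valid values are invented): the message says where the problem is, what the constraint was, and which default is used instead.

  import java.util.List;

  class OptionValidator {
      private static final List<String> VALID_MODES = List.of("fast", "safe", "exact");

      // Returns a usable value and tells the user when a default is substituted.
      static String validateMode(String userValue) {
          if (VALID_MODES.contains(userValue)) {
              return userValue;
          }
          // Friendly, specific message: where the error occurred and what constraint was violated.
          System.err.println("Config file, option 'mode': '" + userValue
                  + "' is invalid; valid options are " + VALID_MODES
                  + ". Using the default 'safe' instead.");
          return "safe";
      }
  }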

I can't think of a lot of situations where an automatic retry with different defaults is warranted, but there are some (auto-retrying to establish a communications link seems reasonable). I agree with @William that this level of 'redundancy' is a business decision.

On the other hand, there should be no run-time robustness against programmer error. If there are pre-conditions for a function's parameters, they should be checked with asserts, not run-time checks. It is a huge pet peeve of mine to see redundant error checking and reporting on the same parameter three or four levels into the call stack:

 int A(int x)
 {
   if (x == 0) return -1;   // the same parameter checked yet again
   // ...
 }
 int B(int x)
 {
   if (x == 0) return -1;   // checks x here...
   int err = A(x);
   if (err) return err;     // ...and also reports A's duplicate check
   // ...
 }
 // and so on and so on....

This is just additional, unneeded complexity. You should not spend time figuring out how one function should handle an error introduced by misusing another one. If this is the type of 'robustness' you are referring to, you don't need it. Replace it with asserts and thorough integration testing.
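
A sketch of that alternative in Java (method names assumed; assertions must be enabled with the -ea flag): the precondition is asserted once where the value is actually used, and the caller does not repeat the check or plumb error codes.

  class Calc {
      static int a(int x) {
          // Programmer error is caught during testing, not handled at run time.
          assert x != 0 : "x must be non-zero";
          return 100 / x;
      }

      static int b(int x) {
          // No duplicate check and no error-code plumbing here.
          return a(x) + 1;
      }
  }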

It's a requirements thing.

Is there a requirement for robustness?

"When the communications link fails, erroneous packets are discarded" "When the link resumes operation, no transaction is processed twice"

There should be use cases for error recovery (otherwise, how do you know how it will happen?).

Leaving it to the programmers to invent robustness as they go (if required) results in "magical" systems.

All magical systems become crappy magic over time. Error correction in the background hides the occurrence of faults, so the faults never get corrected, and the system eventually degrades to a state of greater entropy and runs like crap as it corrects for errors all the time. You must have a limit that stops the system from entering a permanently degraded state.

Some operations probably warrant a "try-again" approach, especially if they depend on external resources like a database. For example, if a database connection cannot be established or a query fails, the operation may be retried a certain number of times before giving up and throwing an error to a higher level. In purely internal logic, however, trying the same thing multiple times is often a symptom of bad code and of magical thinking that hides real problems.
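
For example, a bounded retry against an external resource might look like the sketch below (the JDBC URL, retry limit, and backoff are placeholders); once the attempts are exhausted, the error is thrown to a higher level.

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.SQLException;

  class DbConnector {
      // Retries a few times because the failure may be transient (external resource),
      // then gives up and lets the caller decide what to do.
      static Connection connectWithRetry(String url, int maxAttempts) throws SQLException {
          SQLException last = new SQLException("connection never attempted");
          for (int attempt = 1; attempt <= maxAttempts; attempt++) {
              try {
                  return DriverManager.getConnection(url);
              } catch (SQLException e) {
                  last = e;
                  try {
                      Thread.sleep(1000L * attempt);   // simple backoff between attempts
                  } catch (InterruptedException ie) {
                      Thread.currentThread().interrupt();
                      break;
                  }
              }
          }
          throw last;   // propagate the error to a higher level, as described above
      }
  }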

Licensed under: CC-BY-SA with attribution