Is there a way to have four possible outcomes without repeating the two conditions? [duplicate]

https://softwareengineering.stackexchange.com/questions/355909

18-01-2021
|

Pregunta

In a generic sense, my question goes something like this:

if (p) {
    if (q) {
        a();
    } else {
        b();
    }
} else {
    if (q) {
        c();
    } else {
        d();
    }
}

There are four outcomes a(), b(), c(), and d() depending on the conditions p and q. Even though p and q will only be evaluated once in any case, q is still repeated in the code. I'm wondering if there's a way to avoid this redundancy.

In my specific case, I'm in a situation where one outcome is actually nothing happening, and another outcome is an additional thing happening. So it looks like this:

if (p) {
    a();
    if (q) {
        b();
    }
} else {
    if (q) {
        c();
    }
}

Is there any way to write this out so that p and q each only show up once? I'm usually good at rearranging things to avoid redundancy but I don't know if it's possible in this case. If it's not possible then is there some recommended way to phrase logical structure like this?

Here I'm using C# but I'll accept answers in any language.

EDIT: My question is certainly similar to How to turn truth table into smallest possible if / else block and I find its answers helpful, but my question looks to me to differ in the sense that I'm specifically asking how to avoid the redundant condition.

Solución

Yes, it's do-able, and may be a good idea

Certainly this is doable, and it can often yield clearer logic, depending on what parameters you are using. Hopefully the parameters are related in some way that ties into a well-understood business concept.

Identify the hidden business concept

In cases where the two parameters map to a single (hidden) business concept, you need to make that concept explicit, using a mapping function that takes p and q and outputs the case.

This example corresponds to your first example:

z = map(p, q)
switch (z)
    case z1:
       a();
       break;
    case z2:
       b();
       break;
    case z3:
       c();
       break;
    case z4:
       d();
       break;
}

Or for your second example:

z = map(p, q)
switch (z)
    case z1:
       a();
       break;
    case z2:
       a();
       b();
       break;
    case z3:
       c();
       break;
    case z4:
       //No operation
       break;
}

What is the mapping function?

The mapping function can be as simple as combining p and q into a bitmask (if they are Booleans) or combining strings (if they are, for example, product and subproduct codes). This could be a stretch in some circumstances, but is a plain and obvious solution in other circumstances. For example, consider these questions that may appear on an application for health insurance:

Are you married? Yes or no
Do you have children? Yes or no

In this case, your company might offer four products (the hidden business concept): single, married, single with children, and married with children. In this sort of scenario it is perfectly justified to combine the user's answers into a virtual product (given by the mapping function) and choosing a logic path based on the product instead of the individual flags. In fact it is much better code, IMO.

Getting rid of the switch case

Switch/case is ugly. Another approach would continue to use the hidden business concept but in a cleaner code pattern:

IProduct z = ProductFactory.GetProduct(q, p);
z.Execute();

...where the factory could return one of four products, all of which implement IProduct which contains a product-specific Execute method that will execute a(), b(), c(), and/or d() as needed.

Won't the factory have the same (original) issue?

The factory could contain nested if statements (bringing back the original problem) or it could use a lookup table for instantiation, like this:

class ProductFactory
{
    private static readonly Dictionary<bool, Dictionary<bool, Func<IProduct>>> _products  
                      = new Dictionary<bool, Dictionary<bool, Func<IProduct>>>();

    static ProductFactory()
    {
        _products.Add(false, new Dictionary<bool, Func<IProduct>> {
            { true,  () => new ProductSingleWithChildren() },
            { false, () => new ProductSingle() }
        });
        _products.Add(true, new Dictionary<bool, Func<IProduct>> {
            { true,  () => new ProductMarriedWithChildren() },
            { false, () => new ProductMarried() }
        });

     }
    public IProduct GetProduct(bool isMarried, bool hasChildren)
    {
        return _products[isMarried][hasChildren]();
    }
}

...where _products is a list of lists, each item containing a Func<IProduct> that will instantiate the appropriate type.

Dictionaries with a key of bool seem a little weird, so in an actual piece of code you might lean toward using an enum instead, or possible a string containing a code. I would suggest using an identifier that is aligned with the business/domain; make it easy for other developers to understand what is going on.

For completeness, here is an idea what the classes would look like:

interface IProduct
{
    void Execute();
}
class ProductSingle : IProduct
{
    public void Execute()
    {
        a();
    }
}
class ProductSingleWithChildren : IProduct 
{
    public void Execute()
    {
        a();
        b();
    }
}
class ProductMarried: IProduct 
{
    public void Execute()
    {
        c();
    }
}
class ProductMarriedWithChildren : IProduct 
{
    public void Execute()
    {
        //No operation
    }
}

You'll notice in the end we not only got rid of the duplicate if condition, there are in fact ZERO if statements in the whole solution!!! :) Pretty neat, and will get you a low cyclomatic complexity score and a nice clean CPU pipeline. Plus much more readable code.

Then again, we just wrote a LOT more lines of code than the original example. So, depending on context, maybe the nested if statements would be better, in certain cases, e.g. the combination of the flags is arbitrary or doesn't map to a logical concept within the business problem, or the extra readability isn't necessary for a certain small problem. It's up to you, of course.

Otros consejos

Short philosophical answer

It is tempting to present a nice trick. I will later in the answer. But I'd prefer to first address the root cause of your problem, even if you might not like it:

I'm usually good at rearranging things to avoid redundancy
- Kyle Delaney

Yes, certainly. But are you as good as your optimizing compiler? Let this kind of low level rearranging to your compiler and focus on the correct code. The compiler has much more trick, as it can work on assembler level.

Premature optimisation is the root of all evil
- Donald Knuth

Don't diddle code to make it faster; find a better algorithm.
- B.W.Kernighan & P.J.Plauger

Long answer: Yes it's possible but not necessarily better

I'll write in C++ as you accept solutions in any language.

Your initial code with 10 instructions will generate different code depending on how you initialize p and q, and what the compiler knows about a(), b() and c()

Here the assembly result if p and q are initialized with a random number. The compiler will keep them in registers as long as possible:

    call    rand         ;  p is returned in register eax
    mov     ebx, eax     ;  ultrafast move to another register
    call    rand         ;  so that q can be kept in register eax
    test    ebx, ebx     ;  check if p is null 
    jne     .L11         ;  if not null continue at .L11 to call a() etc...
    test    eax, eax     ;  here we're in the "else": check if q is null 
    jne     .L12         ;  if not null go to .L12 to call c() and finish
.L4:
    ...                  ; function exit code, not relevant for us
    ret 
.L12:
    call    c()
    jmp     .L4
.L11:
    mov     DWORD PTR [rsp+12], eax   ; save q : register might be overwritten
    call    a()                       ; execute a() 
    mov     eax, DWORD PTR [rsp+12]   ; restore q
    test    eax, eax                  ; and test if it's null
    je      .L4                       
    call    b()                       ; if it's not call b(). 
    jmp     .L4

As you see, there are two accesses to q as in the source code. No way round this, apparently. However, the code is mostly register based so very fast.

Now a variant, where p and q are initialized from the standard input. To do this, the address of p and q must be "leaked" to a library function, and once leaked, as the compiler doesn't know what happens in a(), b() and c(), it assumes the worst (i.e., that each of this function could maliciously modify the booleans reusing the "leaked" address).

Long story short, it'll be 17 relevant assembly instructions. Again, q will be read in two instructions. Unfortunately here p and q will be written and read from the stack memory, because of the above mentioned constraint. Memory access is slightly slower than register access. So this version will be less performant. But you see, the compiler takes a lot of things into consideration. WOuld you have thought about the risk of q being changed by a() ?

Finally, I propose a solution using a function pointer. It can probably be implemented in a similar manner in other languages that support lambda functions.

void (*x)() = c;   // function pointer pointing by default to c() function
if (p) {          // if it's p then we will execute a()
    a();
    x = b;        // and depending on q, we could have to execute b() later instead of c()
}
if (q) {
    x();          // call the function pointed to
}

I'm not sure that it will add to the readability of the code, but p and q are accessed only in one place. The code is also shorter: 6 instructions in the source code and 14 relevant instructions in the assembly. I could feel satisfied, but is it really a good idea ?

Unfortunately, the call to the function pointer will be slower than a direct call to the function address (because of the indirection). And the CPU will have to execute one or two additional statements to load and update the function pointer. So despite being of shorter size and nicely arranged, it will result in slower and less readable code.

Therefore, I'd strongly reiterate my first philosophical answer.

Yes, Simply concatenate P and Q and put the result into your conditional.

The question is why would you do this?

Licenciado bajo: CC-BY-SA con atribución

No afiliado a softwareengineering.stackexchange