Is there a way to have four possible outcomes without repeating the two conditions? [duplicate]
https://softwareengineering.stackexchange.com/questions/355909
-
18-01-2021 - |
Pregunta
In a generic sense, my question goes something like this:
if (p) {
if (q) {
a();
} else {
b();
}
} else {
if (q) {
c();
} else {
d();
}
}
There are four outcomes a()
, b()
, c()
, and d()
depending on the conditions p
and q
. Even though p
and q
will only be evaluated once in any case, q
is still repeated in the code. I'm wondering if there's a way to avoid this redundancy.
In my specific case, I'm in a situation where one outcome is actually nothing happening, and another outcome is an additional thing happening. So it looks like this:
if (p) {
a();
if (q) {
b();
}
} else {
if (q) {
c();
}
}
Is there any way to write this out so that p
and q
each only show up once? I'm usually good at rearranging things to avoid redundancy but I don't know if it's possible in this case. If it's not possible then is there some recommended way to phrase logical structure like this?
Here I'm using C# but I'll accept answers in any language.
EDIT: My question is certainly similar to How to turn truth table into smallest possible if / else block and I find its answers helpful, but my question looks to me to differ in the sense that I'm specifically asking how to avoid the redundant condition.
Solución
Yes, it's do-able, and may be a good idea
Certainly this is doable, and it can often yield clearer logic, depending on what parameters you are using. Hopefully the parameters are related in some way that ties into a well-understood business concept.
Identify the hidden business concept
In cases where the two parameters map to a single (hidden) business concept, you need to make that concept explicit, using a mapping function that takes p
and q
and outputs the case.
This example corresponds to your first example:
z = map(p, q)
switch (z)
case z1:
a();
break;
case z2:
b();
break;
case z3:
c();
break;
case z4:
d();
break;
}
Or for your second example:
z = map(p, q)
switch (z)
case z1:
a();
break;
case z2:
a();
b();
break;
case z3:
c();
break;
case z4:
//No operation
break;
}
What is the mapping function?
The mapping function can be as simple as combining p
and q
into a bitmask (if they are Booleans) or combining strings (if they are, for example, product and subproduct codes). This could be a stretch in some circumstances, but is a plain and obvious solution in other circumstances. For example, consider these questions that may appear on an application for health insurance:
- Are you married? Yes or no
- Do you have children? Yes or no
In this case, your company might offer four products (the hidden business concept): single, married, single with children, and married with children. In this sort of scenario it is perfectly justified to combine the user's answers into a virtual product (given by the mapping function) and choosing a logic path based on the product instead of the individual flags. In fact it is much better code, IMO.
Getting rid of the switch case
Switch/case is ugly. Another approach would continue to use the hidden business concept but in a cleaner code pattern:
IProduct z = ProductFactory.GetProduct(q, p);
z.Execute();
...where the factory could return one of four products, all of which implement IProduct
which contains a product-specific Execute
method that will execute a()
, b()
, c()
, and/or d()
as needed.
Won't the factory have the same (original) issue?
The factory could contain nested if
statements (bringing back the original problem) or it could use a lookup table for instantiation, like this:
class ProductFactory
{
private static readonly Dictionary<bool, Dictionary<bool, Func<IProduct>>> _products
= new Dictionary<bool, Dictionary<bool, Func<IProduct>>>();
static ProductFactory()
{
_products.Add(false, new Dictionary<bool, Func<IProduct>> {
{ true, () => new ProductSingleWithChildren() },
{ false, () => new ProductSingle() }
});
_products.Add(true, new Dictionary<bool, Func<IProduct>> {
{ true, () => new ProductMarriedWithChildren() },
{ false, () => new ProductMarried() }
});
}
public IProduct GetProduct(bool isMarried, bool hasChildren)
{
return _products[isMarried][hasChildren]();
}
}
...where _products
is a list of lists, each item containing a Func<IProduct>
that will instantiate the appropriate type.
Dictionaries with a key of bool
seem a little weird, so in an actual piece of code you might lean toward using an enum
instead, or possible a string
containing a code. I would suggest using an identifier that is aligned with the business/domain; make it easy for other developers to understand what is going on.
For completeness, here is an idea what the classes would look like:
interface IProduct
{
void Execute();
}
class ProductSingle : IProduct
{
public void Execute()
{
a();
}
}
class ProductSingleWithChildren : IProduct
{
public void Execute()
{
a();
b();
}
}
class ProductMarried: IProduct
{
public void Execute()
{
c();
}
}
class ProductMarriedWithChildren : IProduct
{
public void Execute()
{
//No operation
}
}
You'll notice in the end we not only got rid of the duplicate if
condition, there are in fact ZERO if
statements in the whole solution!!! :) Pretty neat, and will get you a low cyclomatic complexity score and a nice clean CPU pipeline. Plus much more readable code.
Then again, we just wrote a LOT more lines of code than the original example. So, depending on context, maybe the nested if
statements would be better, in certain cases, e.g. the combination of the flags is arbitrary or doesn't map to a logical concept within the business problem, or the extra readability isn't necessary for a certain small problem. It's up to you, of course.
Otros consejos
Short philosophical answer
It is tempting to present a nice trick. I will later in the answer. But I'd prefer to first address the root cause of your problem, even if you might not like it:
I'm usually good at rearranging things to avoid redundancy
- Kyle Delaney
Yes, certainly. But are you as good as your optimizing compiler? Let this kind of low level rearranging to your compiler and focus on the correct code. The compiler has much more trick, as it can work on assembler level.
Premature optimisation is the root of all evil
- Donald KnuthDon't diddle code to make it faster; find a better algorithm.
- B.W.Kernighan & P.J.Plauger
Long answer: Yes it's possible but not necessarily better
I'll write in C++ as you accept solutions in any language.
Your initial code with 10 instructions will generate different code depending on how you initialize p
and q
, and what the compiler knows about a()
, b()
and c()
Here the assembly result if p
and q
are initialized with a random number. The compiler will keep them in registers as long as possible:
call rand ; p is returned in register eax
mov ebx, eax ; ultrafast move to another register
call rand ; so that q can be kept in register eax
test ebx, ebx ; check if p is null
jne .L11 ; if not null continue at .L11 to call a() etc...
test eax, eax ; here we're in the "else": check if q is null
jne .L12 ; if not null go to .L12 to call c() and finish
.L4:
... ; function exit code, not relevant for us
ret
.L12:
call c()
jmp .L4
.L11:
mov DWORD PTR [rsp+12], eax ; save q : register might be overwritten
call a() ; execute a()
mov eax, DWORD PTR [rsp+12] ; restore q
test eax, eax ; and test if it's null
je .L4
call b() ; if it's not call b().
jmp .L4
As you see, there are two accesses to q
as in the source code. No way round this, apparently. However, the code is mostly register based so very fast.
Now a variant, where p
and q
are initialized from the standard input. To do this, the address of p
and q
must be "leaked" to a library function, and once leaked, as the compiler doesn't know what happens in a()
, b()
and c()
, it assumes the worst (i.e., that each of this function could maliciously modify the booleans reusing the "leaked" address).
Long story short, it'll be 17 relevant assembly instructions. Again, q
will be read in two instructions. Unfortunately here p
and q
will be written and read from the stack memory, because of the above mentioned constraint. Memory access is slightly slower than register access. So this version will be less performant. But you see, the compiler takes a lot of things into consideration. WOuld you have thought about the risk of q
being changed by a()
?
Finally, I propose a solution using a function pointer. It can probably be implemented in a similar manner in other languages that support lambda functions.
void (*x)() = c; // function pointer pointing by default to c() function
if (p) { // if it's p then we will execute a()
a();
x = b; // and depending on q, we could have to execute b() later instead of c()
}
if (q) {
x(); // call the function pointed to
}
I'm not sure that it will add to the readability of the code, but p
and q
are accessed only in one place. The code is also shorter: 6 instructions in the source code and 14 relevant instructions in the assembly. I could feel satisfied, but is it really a good idea ?
Unfortunately, the call to the function pointer will be slower than a direct call to the function address (because of the indirection). And the CPU will have to execute one or two additional statements to load and update the function pointer. So despite being of shorter size and nicely arranged, it will result in slower and less readable code.
Therefore, I'd strongly reiterate my first philosophical answer.
Yes, Simply concatenate P and Q and put the result into your conditional.
The question is why would you do this?