Why would a language NOT use Short-circuit evaluation?

https://stackoverflow.com/questions/1445867

22-07-2019

Question

Why would a language NOT use Short-circuit evaluation? Are there any benefits of not using it?

I see that it could lead to some performance issues... is that true? Why?

Related question: Benefits of using short-circuit evaluation

Solution

Reasons NOT to use short-circuit evaluation:

1.) Because it will behave differently and produce different results if your functions, property getters or operator methods have side effects. This may conflict with: A) language standards, B) previous versions of your language, or C) the default assumptions of your language's typical users. These are the reasons that VB has for not short-circuiting.
2.) Because you may want the compiler to have the freedom to reorder and prune expressions, operators and sub-expressions as it sees fit, rather than in the order that the user typed them in. These are the reasons that SQL has for not short-circuiting (or at least not in the way that most developers coming to SQL think it would). Thus SQL (and some other languages) may short-circuit, but only if it decides to and not necessarily in the order that you implicitly specified.
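
To make the first reason concrete, here is a minimal Python sketch. The names check and calls are hypothetical and only serve the illustration. Python's or short-circuits, while | applied to two bools evaluates both operands, so the truth value is the same but the recorded side effects differ.

```python
calls = []  # records which operands were actually evaluated


def check(name, value):
    """Hypothetical operand with a visible side effect: it logs its own call."""
    calls.append(name)
    return value


# Short-circuit: the right operand is skipped once the left one is True.
short_result = check("left", True) or check("right", False)
short_calls = list(calls)

calls.clear()

# Eager: `|` on bools evaluates both operands before combining them.
eager_result = check("left", True) | check("right", False)
eager_calls = list(calls)

print(short_result, short_calls)
print(eager_result, eager_calls)
```

A language that switches from one behavior to the other would silently change which of those calls happen, which is the compatibility concern described above.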

I am assuming here that you are asking about "automatic, implicit order-specific short-circuiting", which is what most developers expect from C, C++, C#, Java, etc. Both VB and SQL have ways to explicitly force order-specific short-circuiting. However, usually when people ask this question it's a "Do What I Meant" question; that is, they mean "why doesn't it Do What I Want?", as in, automatically short-circuit in the order that I wrote it.

OTHER TIPS

One benefit I can think of is that some operations might have side-effects that you might expect to happen.

Example:

if (true || someBooleanFunctionWithSideEffect()) {
    ...
}

But that's typically frowned upon.

Ada does not do it by default. In order to force short-circuit evaluation, you have to use "and then" or "or else" instead of "and" or "or".

The issue is that there are some circumstances where it actually slows things down. If the second condition is quick to calculate and the first condition is almost always true for "and" or false for "or", then the extra check-branch instruction is kind of a waste. However, I understand that with modern processors with branch predictors, this isn't so much the case. Another issue is that the compiler may happen to know that the second half is cheaper or likely to fail, and may want to reorder the check accordingly (which it couldn't do if short-circuit behavior is defined).

I've heard objections that it can lead to unexpected behavior of the code in the case where the second test has side effects. IMHO it is only "unexpected" if you don't know your language very well, but some will argue this.

In case you are interested in what actual language designers have to say about this issue, here's an excerpt from the Ada 83 (original language) Rationale:

The operands of a boolean expression such as A and B can be evaluated in any order. Depending on the complexity of the term B, it may be more efficient (on some but not all machines) to evaluate B only when the term A has the value TRUE. This however is an optimization decision taken by the compiler and it would be incorrect to assume that this optimization is always done. In other situations we may want to express a conjunction of conditions where each condition should be evaluated (has meaning) only if the previous condition is satisfied. Both of these things may be done with short-circuit control forms ...

In Algol 60 one can achieve the effect of short-circuit evaluation only by use of conditional expressions, since complete evaluation is performed otherwise. This often leads to constructs that are tedious to follow...

Several languages do not define how boolean conditions are to be evaluated. As a consequence programs based on short-circuit evaluation will not be portable. This clearly illustrates the need to separate boolean operators from short-circuit control forms.

Look at my example at "On SQL Server boolean operator short-circuit", which shows why a certain access path in SQL is more efficient if boolean short-circuit is not used. My blog example shows how relying on boolean short-circuit can break your code if you assume it in SQL; but if you read the reasoning for why SQL evaluates the right-hand side first, you'll see that this is correct and that it results in a much improved access path.

Bill has alluded to a valid reason not to use short-circuiting, but to spell it out in more detail: highly parallel architectures sometimes have problems with branching control paths.

Take NVIDIA’s CUDA architecture for example. The graphics chips use an SIMT architecture which means that the same code is executed on many parallel threads. However, this only works if all threads take the same conditional branch every time. If different threads take different code paths, evaluation is serialized – which means that the advantage of parallelization is lost, because some of the threads have to wait while others execute the alternative code branch.

Short-circuiting actually involves branching the code so short-circuit operations may be harmful on SIMT architectures like CUDA.

– But like Bill said, that’s a hardware consideration. As far as languages go, I’d answer your question with a resounding no: preventing short-circuiting does not make sense.

I'd say 99 times out of 100 I would prefer the short-circuiting operators for performance.

But there are two big reasons I've found where I won't use them. (By the way, my examples are in C where && and || are short-circuiting and & and | are not.)

1.) When you want to call two or more functions in an if statement regardless of the value returned by the first.

if (isABC() || isXYZ()) // short-circuiting logical operator
    //do stuff;

In that case isXYZ() is only called if isABC() returns false. But you may want isXYZ() to be called no matter what.

So instead you do this:

if (isABC() | isXYZ()) // non-short-circuiting bitwise operator
    //do stuff;

2.) When you're performing boolean math with integers.

myNumber = i && 8; // short-circuiting logical operator

is not necessarily the same as:

myNumber = i & 8; // non-short-circuiting bitwise operator

In this situation you can actually get different results, because the logical operator only yields the truth value of the expression (0 or 1), while the bitwise operator combines the individual bits of both operands. That makes the short-circuiting operators basically useless for boolean math. So in this case I'd use the non-short-circuiting (bitwise) operators instead.
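
The difference between the two assignments can be checked with a small Python sketch. The helper names logical_and and bitwise_and are made up for this illustration. Note that Python's own and returns one of its operands rather than 0 or 1, so the first helper emulates the C behavior of && explicitly.

```python
def logical_and(i, mask):
    # Emulates C's `i && mask`: 1 if both operands are nonzero, otherwise 0.
    return int(bool(i) and bool(mask))


def bitwise_and(i, mask):
    # Same as C's `i & mask`: keeps only the bits that are set in both operands.
    return i & mask


for i in (0, 4, 8, 12):
    print(i, logical_and(i, 8), bitwise_and(i, 8))
```

For i = 4 the logical form gives 1 while the bitwise form gives 0, and for i = 12 they give 1 and 8 respectively.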

Like I was hinting at, these two scenarios really are rare for me. But you can see there are real programming reasons for both types of operators. And luckily most of the popular languages today have both. Even VB.NET has the AndAlso and OrElse short-circuiting operators. If a language today doesn't have both I'd say it's behind the times and really limits the programmer.

If you wanted the right hand side to be evaluated:

if( x < 13 | ++y > 10 )
    printf("do something\n");

Perhaps you wanted y to be incremented whether or not x < 13. A good argument against doing this, however, is that creating conditions without side effects is usually better programming practice.

As a stretch:

If you wanted a language to be super secure (at the cost of awesomeness), you would remove short circuit eval. When something 'secure' takes a variable amount of time to happen, a Timing Attack could be used to mess with it. Short circuit eval results in things taking different times to execute, hence poking the hole for the attack. In this case, not even allowing short circuit eval would hopefully help write more secure algorithms (wrt timing attacks anyway).
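
As a rough illustration of the timing concern, the sketch below contrasts a comparison that exits at the first mismatch with one that always scans every byte. Both function names are hypothetical, and pure Python gives no real constant-time guarantee, so this only shows the shape of the idea. In practice the standard library's hmac.compare_digest is the function intended for this job.

```python
import hmac


def early_exit_equal(a: bytes, b: bytes) -> bool:
    # Returns at the first mismatch, so the running time depends on
    # how long the matching prefix is.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def full_scan_equal(a: bytes, b: bytes) -> bool:
    # Always visits every byte; differences are accumulated with `|`
    # instead of branching out of the loop.
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y
    return diff == 0


print(early_exit_equal(b"secret", b"secreT"))
print(full_scan_equal(b"secret", b"secreT"))
print(hmac.compare_digest(b"secret", b"secret"))
```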

The Ada programming language supported both boolean operators that did not short circuit (AND, OR), to allow a compiler to optimize and possibly parallelize the constructs, and operators with explicit request for short circuit (AND THEN, OR ELSE) when that's what the programmer desires. The downside to such a dual-pronged approach is to make the language a bit more complex (1000 design decisions taken in the same "let's do both!" vein will make a programming language a LOT more complex overall;-).

Not that I think this is what's going on in any language now, but it would be rather interesting to feed both sides of an operation to different threads. Most operands could be pre-determined not to interfere with each other, so they would be good candidates for handing off to different CPUs.

This kind of thing matters on highly parallel CPUs that tend to evaluate multiple branches and choose one.

Hey, it's a bit of a stretch but you asked "Why would a language"... not "Why does a language".
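
The idea of handing both operands to different threads can be sketched in Python as follows. The two predicates and parallel_and are hypothetical, and the sketch assumes the operands have no side effects and do not depend on each other. In standard CPython the global interpreter lock limits how much CPU-bound work really overlaps, so this shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def left_operand():
    # Hypothetical side-effect-free predicate.
    return sum(range(1000)) > 10


def right_operand():
    # Hypothetical side-effect-free predicate.
    return len("short-circuit") > 20


def parallel_and(f, g):
    # Both operands are submitted immediately; neither waits for the other.
    with ThreadPoolExecutor(max_workers=2) as pool:
        left = pool.submit(f)
        right = pool.submit(g)
        # `&` combines the two results without any short-circuiting.
        return left.result() & right.result()


print(parallel_and(left_operand, right_operand))
```

With short-circuit semantics this would not be allowed, because the right operand must not run at all when the left one already decides the result.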

The language Lustre does not use short-circuit evaluation. In if-then-elses, both then and else branches are evaluated at each tick, and one is considered the result of the conditional depending on the evaluation of the condition.

The reason is that this language, and other synchronous dataflow languages, have a concise syntax to speak of the past. Each branch needs to be computed so that the past of each is available if it becomes necessary in future cycles. The language is supposed to be functional, so that wouldn't matter, but you may call C functions from it (and perhaps notice they are called more often than you thought).

In Lustre, writing the equivalent of

if (y <> 0) then 100/y else 100

is a typical beginner mistake. The division by zero is not avoided, because the expression 100/y is evaluated even on cycles when y=0.
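
The same pitfall can be imitated in Python with a hypothetical eager_if function. Because function arguments are evaluated before the call, both branch values are computed every time, much like both branches of the Lustre conditional. One way around it, sketched in safe_div, is to make the division itself always well defined.

```python
def eager_if(cond, then_value, else_value):
    # Both branch values were already computed by the caller.
    return then_value if cond else else_value


def safe_div(y):
    # Guard the divisor so that the division is defined on every call.
    divisor = y if y != 0 else 1
    return eager_if(y != 0, 100 / divisor, 100)


try:
    y = 0
    eager_if(y != 0, 100 / y, 100)
    outcome = "no error"
except ZeroDivisionError:
    outcome = "division by zero"

print(outcome)
print(safe_div(0), safe_div(4))
```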

Because short-circuiting can change the behavior of an application, e.g.:

if(!SomeMethodThatChangesState() || !SomeOtherMethodThatChangesState())

I'd say it's valid for readability issues; if someone takes advantage of short circuit evaluation in a not fully obvious way, it can be hard for a maintainer to look at the same code and understand the logic.

If memory serves, Erlang provides two sets of constructs: the standard and/or, and then andalso/orelse. This clarifies intent ("yes, I know this is short-circuiting, and you should too"), whereas at other points the intent needs to be derived from the code.

As an example, say a maintainer comes across these lines:

if(user.inDatabase() || user.insertInDatabase()) 
    user.DoCoolStuff();

It takes a few seconds to recognize that the intent is "if the user isn't in the Database, insert him/her/it; if that works do cool stuff".

As others have pointed out, this is really only relevant when doing things with side effects.
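
For comparison, here is one way the intent of that condition could be written out step by step. This is a Python sketch, not the original code: the User class and its methods are hypothetical stand-ins for the user object in the snippet.

```python
class User:
    """Hypothetical stand-in for the `user` object in the snippet above."""

    def __init__(self, stored):
        self.stored = stored
        self.log = []

    def in_database(self):
        return self.stored

    def insert_in_database(self):
        self.stored = True
        self.log.append("insert")
        return True

    def do_cool_stuff(self):
        self.log.append("cool stuff")


def ensure_and_run(user):
    # Step 1: make sure the user exists, inserting only when needed.
    available = user.in_database()
    if not available:
        available = user.insert_in_database()
    # Step 2: act only if the user is now in the database.
    if available:
        user.do_cool_stuff()


new_user = User(stored=False)
ensure_and_run(new_user)
print(new_user.log)
```

It is longer than the one-line condition, but the insert is no longer hidden inside a boolean expression.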

I don't know about any performance issues, but one possible argument for avoiding it (or at least excessive use of it) is that it may confuse other developers.

There are already great responses about the side-effect issue, but I didn't see anything about the performance aspect of the question.

If you do not allow short-circuit evaluation, the performance issue is that both sides must be evaluated even though it will not change the outcome. This is usually a non-issue, but may become relevant under one of these two circumstances:

1.) The code is in an inner loop that is called very frequently
2.) There is a high cost associated with evaluating the expressions (perhaps IO or an expensive computation)
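
The cost difference can be made visible by counting evaluations. In this Python sketch, expensive is a hypothetical costly predicate. Passing any() a fully built list tests every element, while passing it a generator stops at the first true result, the same way a chain of or would.

```python
evaluated = []


def expensive(n):
    # Hypothetical costly predicate; it records each call so the cost is visible.
    evaluated.append(n)
    return n > 2


values = [1, 5, 9, 13]

# Eager: the list is built first, so every element is tested.
eager = any([expensive(v) for v in values])
eager_count = len(evaluated)

evaluated.clear()

# Lazy: the generator is consumed only until the first True.
lazy = any(expensive(v) for v in values)
lazy_count = len(evaluated)

print(eager, eager_count)
print(lazy, lazy_count)
```

Both forms return the same answer, but the eager one makes four calls where the lazy one makes two.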

The short-circuit evaluation automatically provides conditional evaluation of a part of the expression.

The main advantage is that it simplifies the expression.

The performance could be improved but you could also observe a penalty for very simple expressions.

Another consequence is that side effects of the evaluation of the expression could be affected.

In general, relying on side effects is not good practice, but in some specific contexts it could be the preferred solution.

VB6 doesn't use short-circuit evaluation. I don't know if newer versions do, but I doubt it. I believe this is just because older versions didn't either, and because most of the people who used VB6 wouldn't expect that to happen, so it would lead to confusion.

This is just one of the things that made it extremely hard for me to get out of being a noob VB programmer who wrote spaghetti code, and get on with my journey to be a real programmer.

Many answers have talked about side-effects. Here's a Python example without side-effects in which (in my opinion) short-circuiting improves readability.

for i in range(len(myarray)):
  if myarray[i]>5 or (i>0 and myarray[i-1]>5):
    print("At index", i, "either arr[i] or arr[i-1] is big")

The short-circuit ensures we never evaluate myarray[-1] when i is 0. In Python that index does not raise an exception for a non-empty list; it silently reads the last element, which could produce a false match. The code could of course be written without short-circuits, e.g.

for i in range(len(myarray)):
  big = myarray[i] > 5           # first test
  if not big:
    if i > 0:                    # only look back when a previous element exists
      big = myarray[i-1] > 5
  if big:
    print("At index", i, ...)

but I think the short-circuit version is more readable.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow