"Semi-complete" Boolean Evaluation?

We have a new apprentice programmer here at work, and I was trying to explain to her the difference between complete and lazy boolean evaluation, and why she should be always be mindful of side effects when it's turned on or off. While she grasped the concept well enough, she confounded me in asking why the distinction exists in the first place (seeing as it creates side effects).

This got me thinking whether "semi-complete" boolean evaluation would be an asset to an imperative PL.

For example:


// Scenario A:
if (boolVar1) and (boolVar2) then
.
.
.
endif;


// Scenario B:
if (boolVar) and (someObjectInstance.someBoolMethod) then
.
.
.
endif;


//Scenario C:
if (someObjectInstance1.someBoolMethod) and (someObjectInstance2.someBoolMethod) then
.
.
.
endif;

In scenario A, completely lazy evaluation is the preferred way, because the expressions being evaluated are boolean variables, pure and simple, and the compiler should automatically enable lazy evaluation for that conditional.

In scenario B, someObjectInstance.someBoolMethod might do some internal processing that we want regardless, yet the ordering of the expressions determines that that might not happen if BoolVar is false and the expression is evaluated "lazily". This in some ways spoils the pure nature of boolean logic because expression order becomes important in determining the result, and switching lazy eval on/off becomes an issue.

Similarly, in scenario C, the same issues apply.

I realise that these issues can be avoided by reordering or splitting the conditional statements into multiple if/then's, but what about if the language, as a rule, ALWAYS evaluated expressions, if they were methods or function calls, regardless of where they occured in the conditional. For scenario B, for example, that would mean that even if boolVar was false, the someObjectInstance.someBoolMethod still gets called, except that the conditional statements below get skipped .

The overall effect would be that the language always presents complete evaluation to the programmer (who then does not have to factor in expression order), but can benefit from lazy evaluation if the compiler deems it appropriate.

It could perhaps also be possible to have a language construct to force the lazy evaluation, for example:


ifl (BoolVar) and (someObjectInstance.someBoolMethod) then...

or


if (BoolVar) and ?(someObjectInstance.someBoolMethod) then...

or suchlike, with the compiler giving appropriate warnings where neccessary.

I understand that allowing expression order to be an issue, is A Bad Thing regardless, to be avoided by careful code habits. I am also not neccessarily advocating the "semi-lazy" method, as I recognise how it introduces a new way of creating undesirable side effects (namely what happens inside the called methods), but I am nevertheless interested in hearing your comments on the topic.

Regards,
Riaan

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

The real question is if expressions should have side effects.

My opinion is that expressions should not be allowed to have non-local side effects. Only statements should have non-local side effects. If there was such a restriction, the issue of short-circuit in expression evaluation would not exist.

That's a good point.

Do you then consider it a good idea to disallow the use of methods and function calls in conditional statements (but which would require extra assignments), for example:


if (boolVar) and (someBoolFunction) then... // compile time error on method call

should be forced to

boolResult := SomeBoolFunction; // implicit side effect
if (boolVar) and (boolResult) then... // compiles ok, no side effects

That's what (good) type systems are for

Do you then consider it a good idea to disallow the use of methods and function calls in conditional statements

No need to be so stringent, just disallow use of any code with side effects in the condition. How to prove your code has no side effects? Use type system capable to describe referentially transparent code (Haskell is a good example of this approach).

Maybe a solution like mutable/const would be better.

I think it is a good idea to disallow function calls with side effects in expressions. But in order to be smart and efficient, a solution like mutable/const could be a better option, and a compiler flag to treat update of "mutable" fields in computations as errors would solve the problem.

Local non-static variables could be considered "mutable" automatically so as that local storage could be used to write an algorithm in imperative style that does not update the world, except its local variables.

"Complete" Boolean evaluation is silly

In almost all cases evaluating the unnecessary branch is either wasteful (it takes a long time), or incorrect (it would cause errors), or equivalent (it's a simple variable). The cases where this is actually intended are so rare that I'm happy with no explicit syntax for that, but introducing an auxiliary variable.

Programming language conjuncts are not commutative. Deal with that. This is more useful.

100%

I agree wholeheartedly. One of my biggest headaches in life is maintaining the code of the previous programmer who's position I filled when he left - the man had no concept of side effects in his code, he happily created an ENORMOUS mess of spaghetti and unknown program state by implementing and calling state-modifying global functions willy nilly. If he discovered that his program was breaking by calling some function, he'd happilly duplicate the "offending" function and modify the copy to suit...

Consistancy.

Even though it may seem simpler to do different things depending on the contents of the if statements, it is well known that consistency is easier to deal with than inconsistency (Boolean evaluation is lazy, except when...). Not only do you have to learn the rule (Boolean evaluation is lazy), you have to learn the exception(s) to the rule as well, and how those exceptions relate to the rule. In the end, I'm sure it will be much simpler for your apprentice programmer (and the rest of us) to learn the one rule, and be careful when evaluating calls with side effects than it would be to learn the rule and its exceptions.

Standard example: x=0 or log x < t

You start from your mathematics text that writes the test as log x < t

Then you realise that x=0 should pass the test, minus infinity will certainly be less than t, but the computer's logarithm function is only defined on stricly positive x, so you add in the extra test by hand. You are careful to put it first, so that it acts as a guard on log.

You really don't want any surprises in which x=0 and the test is obviously passed, but the computer applies an extra rule of interpretation and keeps going and tells 0 to an object that can't cope with it.

C like languages support both

C like languages actually support both styles. The && and || operators have shortcut/lazy semantics, while & and | always evaluate both sides. I don't think this is a very hard concept to grasp, not nearly as hard as remembering the precedence levels of && and ||.

C bitwise operators

The & and | in C are bitwise operations---they do not have the same semantics as && and ||. Specifically (1 & 2) is false.

bitwise and boolean

IIRC boolean-valued expressions in C evaluate to 0 for false and -1 for true. Using twos-complement representation of integers, & and | are closed over {false (0), true(-1)}. So the original assertion is correct. OTOH, anyone who obfuscates his code by entangling integer and boolean values in a single expression (not unheard-of among the C crowd) will likely reap the confusion he has sown.

C-like languages...

He did mention C-like languages, which could include both Java and C# - for which it does work.

It also works in C, if you stick with C99 bools (as for C++, I'm not up to recent standards). And it will also “just work” if you settle with (the sane practice of) using the canonical thruth value alone: !0 (which is 1, by the way, and not -1 as the above poster suggests). But yeah, it can lead to bugs in C (as can any/everything else :)

...more C trivia...

Unless the C99 standard has snuck something past me or the voices in my head are lying, boolean true is "non-zero"---the representation of true is up to the implementation. Plus, many functions that idiomatically get used as predicates return 0 for the normal condition (a successful operation, for example) and several different non-zero values for a "true" (an error return code, for example).

C is a very poor language, for a lot of applications. But it does have its good points, like making sloppiness painful.

I don't know C#, and try to stay away from this sort of thing in Java. Bitwise operations on signed numbers scare me.

Well, I might be mistaken...

... but, at least from this, I gathered “true” had to expand to 1 in C99. Also for “bool” typed variables, any value except false/0 is supposedly coerced to true/1. This doesn't mean the semanthics of if's, while's, for's, &&'s and ||'s have changed, though.

And about Java and C#, the &, |, ^ operators are well defined for (boolean, boolean) - you can't even mix booleans with integers. So, I fail to see what's scary there, except perhaps the similarity to their short-circuit equivalents (but should they really look diferent?).

Anyway I mostly only use ^ (in Java), as I have hardly ever any use for “complete boolean evaluation”. People still try to convince me to use != (and ==) instead though, I just really don't get why.

to avoid crashing

Another use case is to avoid the obvious NullPointerException (or crash, if you're in C++) in code like this:

if (trolls != null && trolls.size() > 0) ...

Ever use VB6?

' Shoot me now
If trolls Is Not Nothing Then
    If trolls.Count > 0 Then
        ...
    End If
End If

More generally:

The original post says, "sometimes an expression has side effects that we want". The general response is that you shouldn't be using expressions like that in boolean expressions, and I agree.

But there's another important point to be made here: "sometimes an expression has effects that we definitely don't want". While technically "side effect" = "state change", state changes aren't the only effect an expression could have. An expression evaluated under the wrong conditions might crash the program; result in an exit() or abort(); throw an exception; call a continuation that blows away the whole stack; or just consume an annoying amount of some resource (time, CPU, memory, network, etc.).

Short-circuiting operators are commonly used to avoid such unpleasantness.

This is especially the case in languages like C++ and Java, due to a specific language design choice in those languages: any pointer (reference) can be null, and chasing a null pointer (reference) is essentially always bad. Thus any method call could go horribly wrong and it's impossible to know it until runtime. Arguably this is bad language design, but that's how it is.

aka Command Query Principle

The avoidance of side effects even has a name, "Command Query Principle", coined by B. Meyer. Imho it's a good thing (tm).

This is not about lazy evaluation

Given that (x and y) == (if x then y else false) in your typical everyday imperative language, then all of your examples boil down to whether both branches of the if should be evaluated. So, this is the macro-expansion of and that explains why it does not evaluate all of its arguments.

Aziz,,,

// Scenario A:
if (x ? y : false) then
...
endif;

// Scenario B:
if (x ? f() : false) then
...
endif;

//Scenario C:
if (f() ? g() : false) then
...
endif;

But then how could you do it without resorting to macros?

In an eager language like Alice ML, you'd have to send lazy parameters

   fun andUser(x, y) = if x then y else false
   val cond = andUser(lazy true, lazy true)

Ada

Ada, by the way, provides both "and"/"or" and the short circut "and then" and "or else".

Re: Ada

Wow, that's really elegant and self-documenting; a nice touch!

I'll have to put Ada on my list of languages to look into.