Best maintainable evaluation strategy?

Lets face it: Almost all code needs to be maintained, meaning it needs to be fixed and/or extended (Except TeX, but even that needs to be ported to new languages/environments ;)). In order to be maintained, code needs to be understood. So what is the evaluation strategy that you would think is best maintainable? I vote for call-by-value.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Efficiency

How efficient is call-by-value when dealing with infinite data structures, or distributed computing?

Is scalability important for maintainability (and understandability)? I think so, if the code may need to be scaled in the future.

Would call-by-name/need be a better strategy for scalability without sacrificing understandability? Maybe not, since it is less predictable. Other suggestions?

Scalability is an issue when

Scalability is an issue when maintaining code, but in my opinion not as much as understandability. It should be as obvious as possible how things are evaluated. I am just not sure which of the evaluation strategies is the most obvious.

call-by-name is in my opinion rather hard to read, because you need to read a lot of code, until you understand a simple function call: The context in which the function is called and the context in which the stuff is "replaced" inside the function. Or am I wrong?

I find the opposite, as it

I find the opposite, as it more readily allows for layering of abstractions, so some developers only need to deal with things at the business logic level, rather than at the bit-twiddling level. Similarly, this leads to better code reuse and in turn better maintainability.

The issue with this is when developers who do understand the bit-twiddling level leave.

call-by-name

When call-by-name or call-by-need is applied in a sane way, expressions (regardless of when evaluated) are evaluated w/r/t the environment where they appeared.

To do otherwise is to risk inadvertent variable capture among other things, leading to things like the 'upward funarg problem' in old Lisps.

Using the environment where the expression appeared limits the complexity of understanding a function call to just two things beyond eager semantics; was the argument evaluated at all, and (if it's a call-by-name language and the argument evaluation has side effects) how many times?

Lexical scoping?

When call-by-name or call-by-need is applied in a sane way, expressions (regardless of when evaluated) are evaluated w/r/t the environment where they appeared.

Isn't that just lexical scoping? Are you saying that call-by-value doesn't enjoy this property?

Not at all

Consider this function: function foo(a, b) { a++; print(b)}. If we call it by value as foo(i, a[i]), then it prints the ith element of a using the values of i and a in the calling environment. Using call by name, however, it prints the i+1th value of a.

Ray is arguing against call by name systems where this sort of thing is permitted.

Mutable references

So you think Ray is making a distinction that only applies to languages with mutable environments? (i.e. Variable bindings can be mutated, rather than just cells?) Your comment leaves me even more confused, because I had thought Ray was arguing in favor of call-by-name.

In favor of *sane* call-by-name

that is, where the value of a CBN argument depends only on what happens before the point of call. This eliminates some of the famous old CBN hacks, like this General Problem Solver in Algol 60:

real procedure GPS(I, N, Z, V);
   real I, N, Z, V;
   begin
      for I := 1 step 1 until N do Z := V;
      GPS := 1
   end

the point being that the argument expressions corresponding to N, Z, and V may very well contain the variable corresponding to I, and therefore this procedure can compute any function over the reals by passing appropriate arguments. (Note that the statement "GPS = 1" effectively makes the procedure return 1.)

This procedure does indeed depend on mutability. There may well be cases that don't depend on mutability, though I admit I can't think of any.

Yes, it is...

Yes, I'm talking about lexical scoping. Yes, call-by-value is even simpler. Call-by-value does not rely on *any* environment, because all argument evaluation happens before the function is even called. All you pass is values.

My point was that there is a broad continuum of argument evaluation strategies, and that you can have lazy or call-by-need evaluation without being required to admit really hopeless amounts of hair.

Really hopeless, unmaintainable amounts of hair is what happens when the callee has access to the caller's environment and can make arbitrary changes/bindings/mutations on things that aren't even being used as arguments. Especially when the callee might do so *accidentally* just because the caller happens to have a variable with the same name as a different variable which the callee expects to use.

Your example of calling

foo(i++ arr[i])

brings up several interesting corner cases. Even in an eager (call-by-value) language, the second argument will depend on which order the arguments are evaluated in, because the first expression changes the binding of i. If your eager language allows argument evaluation in either order, then this is ambiguous.

In a lazy (call-by-need) language, you have the additional possibility that one or both of these expressions won't even be evaluated at all, depending on whether the callee needs their values or not. You also have the possibility that, even if both arguments are evaluated, they might be evaluated in either order depending on *when* the callee evaluates them. And that may or may not be controllable via the source code of the callee.

In a call-by-need language with optimistic argument evaluation, the runtime is free to evaluate the arguments, even if it does not need their values. This permits clever compiler writers to optimize but removes all control over the order of evaluation from the program.

In a call-by-name language, you have the possibility that one or both of these expressions may be evaluated some number of times that you can't control from the caller. Depending on what foo does this may be an idiom for stepping through the array and doing something with all values, or something like that.

Each of these represents some additional hairiness in terms of maintenance, but are still maintainable and within what I'd consider a "sweet spot" for a general-purpose language. But each of them becomes an order of magnitude worse when some rule other than lexical scoping is used.

Who says TeX code doesn't need to be maintained? :-)