Programming Language Beauty: Look Closure

In the past year I have been passionately fighting what Simon Peyton Jones calls "the effects monster", although it often feels like I am fighting windmills instead. No useful programs can be written without effects, but effects turn bad when they are observable from within the program itself. Instead, we should strive to encapsulate effects so that they become harmless first-class pure values, but more on that in the future. In this first installment of a longer series on the perils of side effects, we will look at one of the most beautiful examples of observable effects, namely closures and variable capture in imperative languages.

A related discussion

A related discussion on LtU where variable capture and side effects caused confusion in Python.

Strict evaluation as an effect

In the abstract of the JAOO conference talk it says:

We argue that fundamentalist functional programming, that is radically eliminating all side-effects from our programming languages, including strict evaluation, is what it takes to conquer the concurrency and parallelism dragon.

Could you give some pointers on strict evaluation as an effect? I'm surprised that the 'fundamentalist' argument is for lazy evaluation, rather than for encapsulating potential non-termination.

Strictness

Have a look at the "partiality" example in section 1.4 of http://www.disi.unige.it/person/MoggiE/ftp/ic91.pdf, or see an older discussion on LtU.

I must be missing something,

I must be missing something; those examples are about partiality, not strictness. The next line of the abstract says:

We must embrace pure lazy functional programming "all the way," with all effects apparent in the type system of the host language using monads.

Is the case you're making about pure functional programming, or about pure and lazy functional programming?

Do you mean that non-termination for all functions (lazy and strict) must be apparent in the type system? Do you think the evaluation order of any nonpure function should be apparent from the type?

If all non-termination is apparent from the type, then what difference does it make if the language is lazy or strict?

Just the wording

The wording of calling strictness an effect is strange. In a total, pure language you can't tell the difference between strict and lazy evaluation (except, of course, in time and space trade-offs).

It seems more correct to say that strictness is a side effect of having side effects. :-) In a language where side effects are central, strictness is one good way of keeping the programmer sane.

Strictness is an effect

Even better, look at the "strictness" monad in section 3.2 of Wadler's classic "Comprehending Monads".

I still think it's partiality

I liked that paper a lot back when I first saw it. There's a lot of good meat in there. I had forgotten about his strictness monad. Thanks for pointing it out.

Anyway, I took a look based on your pointer and as far as I can tell the effect seems to be more partiality than strictness. E.g. the strictness monad he uses is based on the core function strict which is

strict f x = if x ≠ ⊥ then f x else ⊥

But if the language didn't have ⊥ in the first place, then the function wouldn't even make sense.

Or, to put it another way, the fact that a lazy, pure language can use a simple function to force strictness (and the resulting possible non-termination) without "tainting" types would seem to indicate that whatever effect we're talking about must already be part of the language. I think that effect is partiality.
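For what it's worth, in Haskell the same idea is usually written with seq; the following is just an illustrative sketch of that strict function, not Wadler's exact formulation:

-- force the argument before applying f; if x is bottom, so is the result
strict :: (a -> b) -> a -> b
strict f x = x `seq` f x

In a total language there is no ⊥ to force, so strict f would be indistinguishable from f, which is why I read the effect here as partiality rather than strictness.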

Proof by monad?

There's an identity monad too, and I'm not convinced most people would call that an effect - perhaps it's evaluation, again on grounds of partiality. As James points out, in a pure total language it's denotationally indistinguishable from a strictness monad - it'd be just an operational hint.

What is purity?

I'd have to agree wholeheartedly with Philippa. It seems silly to argue that because something can fit into a monad, it is therefore an effect.

Most concepts are not defined as monads; however, it seems most computational concepts can be shoehorned into a monad somehow. Certainly not all of these concepts are effectful, the identity monad being a trivial example. I'd bet there are more compelling counterexamples to that line of reasoning.

Also, not all effects are created equal. Some are highly pernicious, others are quite benign. In my book, concurrency is probably the most pernicious effect (with or without amplification by shared state), fresh name generation I'd rank among the least.

Even if you count strict evaluation as an effect, I'd have to put it solidly below even fresh name generation. At best, it's an effect-enabler.

When I got into FP, I formed the impression that purity meant two things: 1. evaluation order could only affect termination, not results, and 2. beta-reduction (i.e. substitution) held true. Fresh name generation breaks #2 but not #1, whereas even message-passing concurrency breaks both. Strictness, on the other hand, breaks neither, nor is termination a problem in practice, as strict evaluation doesn't prevent programmers from writing programs that terminate.

I like Neel's argument that under liberal definitions of "purity" neither strict nor lazy evaluation is 'strictly' more pure than the other.

Effects make certain

Effects make certain observational equivalences false. If you add nontermination as an effect, you'll discover that call-by-value and call-by-name lose different equations.

Call-by-value preserves equations at positive types (sums), and call-by-name preserves equations at negative types (function types). (Pairs are both positive and negative, or rather strict and lazy pairs are two different types.)

In a call-by-name language, the following equation holds, which is not valid in a call-by-value language:

  f = \x. f x   // extensionality for functions

This equation fails because in a cbv language, the rhs might diverge whereas the lhs never will.

However, in a cbv language, the following equation holds, which fails in cbn:

   f (match sum with Inl x -> e | Inr y -> e')
   ==
   match sum with
     Inl x -> f e 
   | Inr y -> f e'

Suppose sum is a divergent term, and suppose f doesn't use its argument. In Haskell, the first term could terminate whereas the second one will always diverge, making these two terms inequivalent. (In ML, if sum diverges, then both will diverge, preserving the equality.)
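To make that asymmetry concrete, here is a small Haskell sketch of the same two terms (the names loop and f are just illustrative):

loop :: Either Int Int
loop = loop                 -- a divergent sum-typed term

f :: a -> Int
f _ = 42                    -- f ignores its argument

term1, term2 :: Int
term1 = f (case loop of { Left x -> x; Right y -> y })   -- terminates with 42
term2 = case loop of { Left x -> f x; Right y -> f y }   -- diverges

Under call-by-value both terms diverge, because loop is evaluated before f is ever applied, so the equation is preserved.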

The belief that lazy+nontermination is "more pure" than strict+nontermination is a misconception. (Maybe it arose from too many semanticists failing to think about positive types like sum types? Category theory forces you to think about function types and products, but lets you avoid thinking about coproducts unless you want to....)

Paul Blain Levy's work on call-by-push-value greatly clarifies all this from a denotational pov, and similarly the work on focusing clarifies this from a proof-theoretic pov. When you compare Levy's work on the jumbo lambda calculus with Zeilberger's account of higher order focused calculi, the similarities practically jump out and punch you in the face.

Symmetric Lambda Calculus

Andrzej Filinski's Master's thesis (ps.gz) discussing the symmetric lambda calculus is also relevant here.

Puzzled

Could you explain how this expression can diverge in a call-by-value language?

\x. f x

I am also curious: could we simplify your 'sum' example to:

f sum
 ==
let p = sum in f p

(I suppose not, because the right-hand side of a let is not evaluated in a lazy language.)

f == \x. f x


 f == \x. f x 

He probably meant that if f = _|_, then the equation fails (though lhs diverges, not rhs). Edit: Though that's true in CBV or CBN, so I'm not sure. The usual example is:

 K x y == x 
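As a concrete sketch of how that equation separates the two evaluation orders (Haskell, names illustrative; K written lowercase):

k :: a -> b -> a
k x _ = x

loop :: Int
loop = loop

example :: Int
example = k 1 loop   -- 1 under lazy evaluation; under call-by-value the
                     -- argument loop would be evaluated first, so it would diverge

So K x y == x fails in a call-by-value language whenever y diverges.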

Eta abstraction.

As Neel said, it never diverges. :-)

Also, this is actually a very standard example. Eta abstraction is a well-known idiom in Lisp and ML to make higher-order functions terminate. As a slightly contrived example, consider a general recursion combinator in OCaml:

# let rec fix f = f (fix f);;
val fix : ('a -> 'a) -> 'a = <fun>

# fix (function fact -> function n -> if n == 0 then 1 else n * fact (n-1));;
Stack overflow during evaluation (looping recursion?).

# let rec fix f = f (function x -> fix f x);;
val fix : (('a -> 'b) -> 'a -> 'b) -> 'a -> 'b = <fun>

# fix (function fact -> function n -> if n == 0 then 1 else n * fact (n-1)) 5;;
- : int = 120

Essentially, this is Y, except that Y is usually expressed using self-application instead of general recursion. However, self-application isn't expressible in most Hindley-Milner derived type systems.

Oops...

Oops, Neel did misspeak. I'm sure it was an honest slip of the tongue. I knew what he meant to say anyway. :-D

It is "folk lore" in

It is "folk lore" in Cybernetics, in "Signals and Linear Systems" (as in Gabel and Roberts), and in automata theory (as in Hartmanis and Stearns' "Algebraic Structure Theory of Sequential Machines") that closures are a mathematically correct way to encapsulate state. I say "folk lore" because I have not been able to find this written up anywhere, although it is used constantly. The basic idea seems to be SP partitions of states. Usually this is applied only to a small number of states, but it can also lead to the idea of a variable with an "attached function" in a closure, in other words an SP partition. I think that working this out is no more than an exercise for a properly trained mathematician, which I am not.

Many problems could be solved in this way. For instance, systems theory has never had any need of the "anti-foundation axiom". They use transforms such as the Laplace and Z-transforms to handle things that don't terminate (i.e. systems). Self-reference is not really a problem; it is a definition of a system.

Am I wrong? I have actually spent some time trying to sort this out without any luck.

Pure Math

Well, in pure mathematics, there is no such thing as state. You have to "emulate" state via explicit transition functions, or encapsulate this technique in a monad.

Pure math does make use of higher-order functions, though there seems to be a mild cultural aversion to them among some mathematicians. Closures are one way to implement higher-order functions, but again, there is no "state" there.
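As a small sketch of what "emulating state via explicit transition functions" looks like in code (Haskell, names illustrative; this is essentially what the State monad packages up):

-- a stateful "counter" with no mutation: the state is threaded explicitly
type Counter a = Int -> (a, Int)

tick :: Counter Int
tick n = (n, n + 1)

twoTicks :: (Int, Int)
twoTicks =
  let (a, s1) = tick 0
      (b, _)  = tick s1
  in (a, b)               -- (0, 1): the "state" is just passed along by hand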

Mutating the iteration variable

With respect to Erik's "Look Closure" bit, it might be worth pointing out that the given code, translated to Lua (syntactically very similar), does the "right" thing -- that is, prints out "456" in both cases.

The reason is that in Lua's "for" loop, the iteration variable is local to the loop body. Unlike C# or JavaScript, the iteration variable is never mutated; it is instantiated fresh for each iteration. It's similar to how you would do iteration in Scheme, where the loop body would be a closure and the iteration variable would be its argument.

Erik's piece is about variable capture, but capture isn't really the problem here. The problem is that a variable is being mutated after it is captured. I'm surprised that Erik didn't point that out explicitly, since variable capture is, in general, a fine thing for a closure to do.

Perhaps the real problem is that the C-like idiom for iteration uses mutable variables when they aren't really necessary. Or perhaps the problem is that this idiom simply goes against the expectation of the programmer, who thinks of iteration variables as local to the loop (probably because he only uses their value and never assigns to them), and thus expects their per-iteration value to be captured.
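The same contrast can be reproduced even in a mostly-pure language once an explicit mutable cell is introduced. The Haskell sketch below (names illustrative) builds three delayed print actions that all capture the same IORef, versus three that capture the per-iteration value:

import Data.IORef
import Control.Monad (forM)

main :: IO ()
main = do
  i <- newIORef (0 :: Int)
  -- each closure captures the same mutable cell i
  shared <- forM [4, 5, 6] $ \n -> do
    writeIORef i n
    return (readIORef i >>= print)
  writeIORef i 7
  sequence_ shared                        -- prints 7 7 7: the cell was mutated after capture
  -- each closure captures the per-iteration value instead
  sequence_ (map print [4, 5, 6 :: Int])  -- prints 4 5 6

Whether a C# or JavaScript loop gives you the first behavior or the second is decided by whether the loop mutates a single binding or introduces a fresh one per iteration.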

Mutation

Note that it is hard to avoid mutation in an imperative language. Even if you capture rvalues (as in Java), you can still wrap the integer inside a one-element array and mutate the contents of the array, even though the captured variable itself is never mutated.

Variable capture or value capture ..

It looks like the problem you're trying to highlight doesn't even need a closure in the loop. You can simply stuff the iteration variable into an array within the loop, print out the array and the examples you give will behave the same way - i.e. 456 in some languages and 777 in some others. No?

(disclaimer: I'm more familiar with scheme than with the intricacies of javascript or c#)

Transforms

Yes, there are important differences in how locals are captured by closures depending on the language. IMHO the JavaScript way is very unhelpful, since it disables any kind of per-iteration loop variable capture.

In haXe, we transform the generated source in order to deal with this, so that all platforms (JavaScript, but also PHP, Flash and NekoVM) get the same behavior: the value is captured according to the language's scoping rules, while still allowing it to be shared and modified between several closures.

JavaScript loop variable capture possible

Well, it is easy to get JavaScript to capture the loop variable. How it can be done is left as a puzzle for the reader. And you don't even need to use "eval". (Do no eval!)

very good

Erik is one of my favourite PL authors; I will follow his series with great interest...

So what you're saying is that...

... people who don't think of closures in their languages from the beginning tend to end up with a confusing, non-orthogonal closure mechanism when they get around to adding them (either by extending the language or via patterns)? I seem to remember a similar issue with Lisp back around the turn of the sixties, which was finally fixed in Scheme (and, to a lesser extent, in Common Lisp).

I am getting really tired of seeing this problem in language after language after language. The designer always says that real closures (a) are too expensive, (b) are difficult to implement, and (c) are not used by anyone except geeky nerds. Of course, after the fact, people who actually use the language end up running into awful problems because they don't have the feature. Then the designer comes up with ugly workarounds or designs something that works half the time (because s/he still doesn't understand the issue).

Maybe the real take-away in all of this is that people who fancy themselves language designers should actually think about the semantics of scoping and variable capture when they initially design the language and not delude themselves that these "niceties" are unimportant. But then, I'm still waiting for pigs to grow wings.

Useful programs can indeed be written.

No useful programs can be written without effects...

This simply isn't true. I've written two compilers and a database which have no problematic side effects in the core code (that is, none that are macro-observable; linear objects, iteration variables, etc. aren't macro-observable). It takes a bit of practice, but it really isn't all that hard.

I think what was meant with

I think what was meant by that statement is that every useful program must produce output. Producing output is an effect, because it changes the state of the world. Looked at this way, it becomes a trivially true statement.

You yourself remark that there were no problematic side effects in the core code, but that is not really a refutation, because it implies that there were side effects in the non-core code.

I think your reading is

I think your reading is correct, because the sentence continues "effects turn bad when they are observable from within the program itself", which further implies that there are benign side effects.

Nonetheless, I don't think it is necessary for a program to 'produce output' via side effects. It's pretty easy to imagine a program that simply returns a number as a result. The number might encode nearly anything if it is big enough.