Scoping based on control flow graph

The Anatomy of a Loop by Olin Shivers

…the lexical-scope rule of the classic λ-calculus, which we’ll call “LC scope,” implies an important property: it ensures that binders dominate references. The key design idea that serves as the foundation for our extensible loop form is to invert cause and effect, taking the desired property as our definition of scope. This gives us a new scoping rule, which we call “BDR scope.”

From this concept, Shivers builds a sub-language for describing iterations which includes what is effectively a directed graph of scopes, rather than a simple tree. Parallel code paths (or perhaps a better description would be unordered code paths) can independently contribute to the binding scope of following code, without being visible to each other. (See the example of quicksort on page 11.)

Is this too weird for real programmers? Or is it a stunningly elegant way of capturing something important about parallel code flows? I vote for the second, but I wonder what other LtUers think.

Here's what I think...

1. Olin Shivers is completely, unquestionably correct that unrestricted recursion is really horrible from the perspective of writing well-structured code.

Why is this?

The reason, since I'm an FP guy, arises from the Curry-Howard isomorphism, which says that in terms of logic, types are propositions and programs are proofs. In the specific case of recursion, a well-founded recursive function is a proof by induction. The kicker is in the phrase "well-founded": CH-style, every recursive call is equivalent to an appeal to the induction hypothesis, and for that to be a valid proof, the function has to be called on a smaller argument than before.

General recursion gives you no help at all in ensuring that your recursive function is only called on smaller arguments. You can write a correct program, but the language is of no use in ensuring it. Well, that sucks. But you're a badass functional programmer, so you think: maybe I can package up the recursion into a higher-order function, and then use that as a structured alternative to unrestricted recursion? You do so, and you find you've invented fold.
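
(For concreteness, here is that packaged-up recursion written out in OCaml; it is just the standard List.fold_right:)

(* The recursion, packaged once and for all: the standard
   List.fold_right, written out. *)
let rec fold_right (f : 'a -> 'b -> 'b) (lst : 'a list) (init : 'b) : 'b =
  match lst with
  | [] -> init                                 (* base case *)
  | x :: rest -> f x (fold_right f rest init)  (* appeal to the IH on a smaller list *)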

This is great, but there's a fly in the ointment. Inductive proofs all structure the induction slightly differently each time (there's induction, course-of-values induction, structural induction, and so on), and your HOF encodes exactly one such induction scheme. Contorting your code to make it fit can be unnatural. As an example, try to use fold to write a function that takes a list of strings and adds commas between the elements:

[]                    --> ""
["foo"]               --> "foo"
["foo"; "bar"]        --> "foo, bar"
["foo"; "bar"; "baz"] --> "foo, bar, baz"

etc...

You'll find that it's really ugly, compared with the natural ML solution:

let rec concat lst =
  match lst with
  | [] -> ""
  | [x] -> x
  | x :: rest -> x ^ ", " ^ concat rest

So the problem Shivers describes is real, important, and not easily dismissed.

2. However, I'm only partly convinced by his case for his loop language. The reason is that I think pattern matching gives you a simpler alternative. If you program with structural recursion -- that is, if every recursive call is on variables that were revealed by a pattern match -- then the termination property is clear, and evident in the code. This even scales to handle "iteration-protocol" style operations, by writing a function that enumerates the elements as a lazy list (see the sketch below).
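
(A minimal OCaml sketch of that last point, using the standard Seq module; the tree type and the elements function are made-up names for illustration:)

type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Enumerate the elements as a lazy list: every recursive call is on a
   component revealed by the match, so termination is evident, and
   nothing is traversed until a consumer forces the next element. *)
let rec elements (t : 'a tree) : 'a Seq.t = fun () ->
  match t with
  | Leaf -> Seq.Nil
  | Node (l, x, r) ->
      Seq.append (elements l) (Seq.cons x (elements r)) ()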

Pattern matching is fairly complex in its own right, though in its favor it preserves a lexical scope discipline more naturally.

concat w/fold-right

Not that you're necessarily wrong in general, but is this so terrible: [in Scheme]

;; ^ is the string-joining operator

(define (concat lst)
  (if (null? lst)
      ""
      (^ (car lst)
         (fold-right (lambda (str rest) (^ ", " str rest))
                     ""
                     (cdr lst)))))

or, if we can use 'cut' to curry ^,

(define (concat lst)
  (if (null? lst)
      ""
      (^ (car lst)
         (fold-right (cut ^ ", " <...>) "" (cdr lst)))))

and there are other options using pattern-matching to handle the (null? lst) case & still using fold-right for the recursion.

or, another option...


(define (concat lst)
  (fold-right (lambda (str rest)
                (^ str (if (equal? rest "") "" ", ") rest))
              ""
              lst))

(k first-solution)

(concat '("" "")) should be ", ", not "".

well the fold version isn't really that complex

well the fold version isn't really that complex. Actually I think it's shorter, though I wouldn't claim it to be more elegant -- just that once you get used to it, there isn't much that can't be done with a fold. So if you'll excuse my Haskell:

-- note: shadows the Prelude's concat
concat []      = ""
concat (h : t) = foldl (\acc el -> acc ++ ", " ++ el) h t

Haskell

Here's another way to solve this problem in Haskell:

import Data.List (intersperse)

join joiner = concat . intersperse joiner
comma_join = join ", "

intersperse is written using recursion, but I can avoid both iteration and recursion in my code, and I like that.

On the original topic, the looping construct looks similar to Haskell's do notation. For example, Figure 1 could be expressed like this:

do x <- <verbose code>
   y <- <verbose code using x>
   guard $ q <verbose code using y>
   let z = <verbose expression>
   return <z expression>

I'm curious if the quicksort example can be expressed in Haskell in a similar way. Its destructive operations look tricky, but maybe it's possible for someone skilled with monads.

fold is simple, for some cases of simple.

I wouldn't claim it to be more elegant,

But for someone who understands what fold does, it is elegant. If we have an associative operator * (really a function S -> S -> S, for some type S, such that (a*b)*c = a*(b*c)), then fold simply places that operator between all the elements of the list. In this case our operator (function) puts a comma between two strings ("A"*"B" = "A, B"; this is what the lambda expression does), and since ("A"*"B")*"C" = "A, B"*"C" = "A, B, C" = "A"*"B, C" = "A"*("B"*"C"), we know the operator is associative. Thus when we apply a fold to a list with this operator, we get the desired result.

What I'm trying to say is: for somebody who understands what fold does, a function that uses fold in its definition seems elegant and simple; for somebody who doesn't, it seems mysterious and complicated. On the other hand, any sort of loop definition will be simple to understand for both.
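
(To make the "place the operator between the elements" reading concrete, here is a sketch in OCaml; join is a made-up name. Seeding the fold with the head of the list, rather than with "", also avoids the empty-string problem pointed out above:)

(* The associative comma-join operator: "A" * "B" = "A, B". *)
let join a b = a ^ ", " ^ b

(* Place join between all the elements of the list; the head is the
   seed, so no identity element is needed, and ["" ; ""] gives ", ". *)
let concat = function
  | [] -> ""
  | x :: rest -> List.fold_left join x rest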

Oh, and I forgot to say...

Oh, and I forgot to say: I think concatenating a list of strings with commas between each string isn't a very good example. I'm sure there are better examples of functions that are complicated with fold and friends but simple in the "natural" ML (or whatever) style. (I do have to admit the "natural" solution is more accessible, so I understand what Mr. Krishnaswami is trying to say.)

I think the reason for the advantage...

I think the reason for the advantage in "natural-ness" of the patterns+recursion style of definition becomes apparent when you try to prove the correctness of your functions: it is easiest when the structure of the program parallels the structure of the proof.

This is because, when you write down a program proof, you want to structure it in two parts: a use of induction, and a case analysis with some specialized reasoning in each case. CH-style, the use of an inductive argument corresponds to writing a recursive function, and the case analysis corresponds to pattern matching. The correctness of the function depends on the patterns being exhaustive -- covering all the possibilities -- and on the recursion being well-founded.

Folds combine case analysis and recursion, which means that proofs of correctness for functions written using them are elegant if and only if both the recursion scheme and the case analysis of the proof match what the fold encodes.

Now, the advantage of folds over general recursion is that they are guaranteed to terminate. However, if you use a type system for checking termination, then I am not convinced that using elimination constants has any advantage at all.

Multiple recursion schemes

Inductive proofs all structure the induction slightly differently each time (there's induction, course-of-values induction, structural induction, and so on), and your HOF encodes exactly one such induction scheme.

In "Recursion schemes from comonads", Tarmo Uustalu, Varmo Vene, and Alberto Pardo show how you can use a comonad-parameterized fold to recover most of recursion schemes you mentioned.

Also freely available

But what about multiple parallel loops?

Let's say, as another simple example, that you want to output lines while cycling through a list of colours -- something that comes up fairly often in HTML generation. Now, I'm just guessing how you would do it with Shivers' constructs, since there isn't a lot of detail about the precise semantics, but it seems like it would be something like this:

(loop
   (subloop (loop (for colour in-vector colours)))
   (subloop (for line in lines))
   (save (coloured-line line colour)))

Now, if I've got this right, the interesting part here is the scoping rule: colour and line are both in scope for the save clause, making the scopes somewhat non-lexical, but still entirely logical.

This is the part that I thought was either warped or elegant, but my vote is still for elegant.

For that example, I find "zip" more readable

Let's say, as another simple example, that you want to output lines while cycling through a list of colours -- something that comes up fairly often in HTML generation.

In Haskell, I would do something like zipWith coloredLine (concat $ repeat colors) lines (where concat $ repeat colors is just cycle colors), and then map or fold over the result.

Of course, that doesn't work so well without lazy evaluation.

Zippage

Of course, that doesn't work so well without lazy evaluation.

Even in a language like Python you can very easily build up an iterator algebra vocabulary using generators. It's a much less elegant approach than relying on lazy evaluation, but once you have built up that vocabulary it is very convenient to use.
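
(The same vocabulary exists in other strict languages; for instance, a minimal OCaml sketch with the standard Seq module -- Seq.cycle and Seq.map2 need OCaml 4.14+, and coloured_line stands in for whatever renders a line:)

(* Pair each line with a colour drawn from an endlessly cycled list:
   Seq.cycle makes the colours infinite, and Seq.map2 stops as soon
   as the finite sequence of lines is exhausted. *)
let coloured_lines coloured_line colours lines =
  Seq.map2 coloured_line
    (Seq.cycle (List.to_seq colours))
    (List.to_seq lines)
  |> List.of_seq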

Is this really about dynamic vs. lexical scope?

Is this really about dynamic vs. lexical scope? Dynamic scope would seem to be a "control flow" kind of scope. Common Lisp allows both.

CFG scoping is lexical

I phrased that badly. It is not that the scope is non-lexical; it is that it doesn't correspond to standard lexical nesting. Although this is not a precise description, it is rather as though the scope works like this:

(loop
   (subloop (loop (for colour in-vector colours)    ⎫   ))
   (subloop (for line in lines)                     ⎭↴  )
 ↳(save (coloured-line line colour)))

That is, the lexical context splits between the subloops, but then rejoins and flows into the (non-contained) (save …). That mirrors the execution flow: the two subloops are iterated in parallel, and the remaining clauses are evaluated once both subiterations have come up with a binding. But it is not dynamic scoping: the scope is entirely static.

Iteration via delimited continuations

It should be mentioned that

(loop
  (for i in '(1 2 3))
  (for j in '#(4 5 6))
  (display (list i j)))

can be realized as

(loop
  (let ((i (enum-list '(1 2 3))))
    (let ((j (enum-vector '#(4 5 6))))
      (display (list i j))
      (newline))))

where

(define (enum-list l)
  ;; shift captures the rest of the loop body as f and calls it on
  ;; each element of the enumerated collection.
  (shift f (for-each f l)))

(define (enum-vector v)
  (shift f
    (let loop ((i 0))
      (if (>= i (vector-length v)) #f
          (begin (f (vector-ref v i)) (loop (+ 1 i)))))))

and loop is just reset. It should be noted how let plays the role of for. It is possible to design a somewhat more complex realization of enumerators, so that both nested and parallel loops become possible: switching from a nested loop to a parallel loop then becomes a matter of switching the top-level form from loop to loop-parallel, preserving the body of the loop. Chung-chieh Shan suggested that example.
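
(For comparison, the same inversion can be sketched in OCaml with Oleg Kiselyov's delimcc library -- a sketch under the assumption that delimcc is installed and that its captured continuations may be invoked multiple times, which the enumerator relies on:)

(* push_prompt plays the role of reset/loop; let plays the role of for. *)
open Delimcc

let p : unit prompt = new_prompt ()

(* Capture the rest of the loop body as f, then call it once per element. *)
let enum_list l = shift p (fun f -> List.iter f l)

let () =
  push_prompt p (fun () ->
    let i = enum_list [1; 2; 3] in
    let j = enum_list [4; 5; 6] in
    Printf.printf "(%d %d)\n" i j)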

Oh! is there an implementation available?

Oh! Is there an implementation of this loop-parallel available anywhere (accessible)?

Any recent developments?

This paper has been on my mind again recently...

Does anyone know of more recent developments with respect to this? Is anyone using it to great effect, etc.?