Generators/Iterators and lazy evaluation

Hi all

I've been using OCaml for a year or so building an interpreter/compiler for the Python programming language as a research project.

It's worked pretty well for most part, but I'm trying to implement some of the features of Python involving 'lazy' features, such as generator functions and list comprehensions.

A generator in Python can be thought of as an arbitrarily generated 'lazy list'. As an example
the following is a generator capable of generating powers of 2 upto a max value.

def pow2(max):
  start = 0
  while (start .lt. max):
    yield 2^start
    start += 1

The 'yield' statement is the point where the function returns the next value and suspends itself until the next time it is 'forced'. At that time it resumes execution where it left off.

OCaml makes this particularly hard to implement this due to lack of 'control flow' features, including even a call/cc. The only way I can see right now is breaking this code down into little functions and keeping track of the right function to call the next time.

I've been thinking about whether I can express this in some elegant way in some lazy construct in OCaml, since this is basically a form of 'lazy' evaluation. However, I come from the C world and am not quite used to 'lazy' thinking :) Any ideas?

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

One word

Delimited Control

Try the libraries on Oleg Kiselyov's
Continuations and delimited control page. If you want to compile your interpreter you'll have to use the monad, but it's often nice to encapsulate interpreter state in a monad anyways.

Tricky

There are two basic styles of implementing such a construct:

  1. Supporting continuations, and building yield with them. It’s enough to have continuations which are applicable only once.

    Pro: it’s fully general: it allows to yield from within any language construct used in a function, even from nested functions.

    Con: implementing continuations affects the function calling convention and can’t achieve all of: efficient, portable, applicable to foreign language code.

  2. Transforming the function containing yield to a state machine which performs one step at a time, from a yield point to the next one. The stack frame is converted to an object with fields, so it can persist between invocations (this one is easy with Python because it already allocates stack frames separately; C# needs more work).

    Pro: it’s a local transformation, doesn’t require any special capabilities of the function call mechanism.

    Con: yield must appear only on the top level of a generator function, among code which uses a single stack frame; in particular you can’t yield from within a local function passed to a higher order function, even if it merely implements a loop; special code must be generated for all language constructs which may be crossed by yield.

Python and C# use the second approach.

Streams language extension

With the streams language extension (comes with the Ocaml distribution) you can do this:

let pow2 high =                            
  let rec aux at =                         
    if at < high then
      (* ocaml doesn't have pow for integers.. *)
      [< '1 lsl at; aux (at + 1) >]
    else
      [< >]
  in
    aux 0

let main () =
  let p = pow2 5 in
  let rec loop () =
    match p with parser
     | [< 'n >] -> print_int n; loop ()
     | [< >] -> ()
  in
    loop ()

let _ = main ()

Such structures can be tedious to use in certain cases, though, and maybe this is what you mean by breaking the task into many functions.

Call/cc-libraries for ocaml do exist. I don't think you can use them in native binaries without using monads, though.