## Seeking suggestions on how to unify the environment, free variables, and current activation record for closures

In my spare time I'm trying to craft a language, mainly out of curiosity. I'm leaning towards a simple OO language similar to Python, i.e. highly dynamic. My main goals for the language are simplicity and orthogonality; a rather secondary goal is performance.

I'm looking at unifying the mechanisms for accessing a value within the scope of a closure. Consider the following (in some arbitrary syntax):

    class Foo:
        m_x = 2

        function f(x):
            function g(y):
                return m_x + x + y;


It's clear that g is a closure needing access to 1) its own scope, 2) f's scope, and 3) the environment of the instance. The straightforward implementation (which I have seen used in actual production languages) would be to have both closure and function types that upon invocation take whatever combination of environment variables and activation records they need. This implies that the invocation of functions and closures differs as well, and you sort of end up with a big heap of switches and what-if-this-is-that-type-of-callable code in your VM. Additionally, the VM ends up with a lot of different load instructions (load-from-stack, load-from-free-vars-stack, load-from-this), which makes anything but a stack-based VM untenable.

The only thing I can come up with is to treat an object and activation records as tables and link them up. A request for a variable can then trickle up these links of tables until a match is found. However, this sounds painfully slow even for the language I was envisaging.
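For what it's worth, the linked-table idea can be sketched in Python as a chain of dict-backed scopes (the `Scope` class and its `lookup` method are names of my own choosing, not part of any real implementation):

```python
class Scope:
    """An object environment or activation record, modelled as a table
    with a link to its enclosing table."""
    def __init__(self, parent=None, **bindings):
        self.parent = parent
        self.bindings = dict(bindings)

    def lookup(self, name):
        # Trickle up the chain of linked tables until a match is found.
        scope = self
        while scope is not None:
            if name in scope.bindings:
                return scope.bindings[name]
            scope = scope.parent
        raise NameError(name)

# instance environment <- f's activation record <- g's activation record
instance = Scope(m_x=2)
f_frame = Scope(parent=instance, x=3)
g_frame = Scope(parent=f_frame, y=4)

result = g_frame.lookup("m_x") + g_frame.lookup("x") + g_frame.lookup("y")
```

As the post says, every variable access costs a table lookup per link traversed, which is the source of the slowness being worried about.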

I'm sure this kind of problem has come up before and some language implements this a lot more cleanly. Does anyone have any thoughts on the matter?

kind regards,
Tomas

### Here's one way to do it.

First, decide that *all* functions are closures. That is, a function value at runtime is a pointer to a pair consisting of the address of its code and an array containing its captured variables.

Then, give your VM an "environment register", which holds the environment array for the current closure. When you make a function call, you 1) evaluate the head position to get the function object, 2) assign the function object's environment array to the environment register, 3) push the return address onto the stack, and 4) jump to the function object's code address.

Do that for everything. Compile global functions as closures which happen to have empty environment arrays. To compile methods, treat a class as implicitly binding a self or this variable, and have your compiler expand all the instance and method accesses to explicitly dereference this variable:

    class Foo:
        m_x = 2

        function f(x):
            function g(y):
                return this.m_x + x + y;


Now, if a method or nested function doesn't happen to refer to this, then this won't show up in its environment array, because it's not a free variable. If it does refer to this, you'll capture it automatically.

This will maximize uniformity, though it's not the most efficient possible implementation.
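A minimal sketch of this calling convention, in Python rather than VM code (the names `make_closure` and `call` are mine, and the `this` dict stands in for the instance environment): every function value is a (code, environment) pair, and `call` installs the pair's environment before transferring control:

```python
# Every function value is a (code, env) pair. "code" here is a Python
# callable taking (env, *args); "env" is the array of captured variables.

def make_closure(code, env):
    return (code, env)

def call(closure, *args):
    code, env = closure      # load env into the "environment register"
    return code(env, *args)  # jump to the code with that env installed

# A hand-compiled version of:  function f(x): function g(y): this.m_x + x + y
def g_code(env, y):
    this, x = env            # g captured `this` and `x`
    return this["m_x"] + x + y

def f_code(env, x):
    (this,) = env            # f (a method) captured only `this`
    return make_closure(g_code, (this, x))

f = make_closure(f_code, ({"m_x": 2},))  # bind f to an instance
g = call(f, 3)                           # invocation is uniform...
result = call(g, 4)                      # ...for every kind of callable
```

The point of the sketch is that `call` never inspects what kind of callable it has; methods, closures, and globals all go through the same two steps.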

### That's quite a nice way to do it

I've thought about what you said. As a consequence of what you're saying, a class and its members should be representable as a function with the member variables as local variables and all member functions as inner functions that close on the outer function's scope, thus capturing the member variables implicitly.

If there are no downsides to this scheme it might be preferable, since captured variables could be implemented as an array lookup (similar to local variables), whereas in a duck-typed language lookups on objects (including, for simplicity, the this object) are always table lookups. I think doing so precludes dynamically modifying classes, though, and it certainly requires you to have all parent classes available at compilation time, which a lookup solution would not mandate.
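That lowering can be written out directly in Python (this is my own illustration of the idea, not anything from a real compiler): the "class" becomes a constructor function whose locals are the member variables, and the methods are inner functions closing over them:

```python
# class Foo / m_x / f / g lowered to a constructor with closures.

def Foo():                 # class Foo:
    m_x = 2                #     m_x = 2
    def f(x):              #     function f(x):
        def g(y):          #         function g(y):
            return m_x + x + y   # m_x is captured implicitly
        return g
    return {"f": f}        # the "object" is just a table of bound methods

obj = Foo()
g = obj["f"](3)
result = g(4)
```

Note how this makes the trade-off in the paragraph above concrete: `m_x` is reached by a fast captured-variable lookup, but there is no longer any object whose table you could patch at runtime.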

What I am still vague on is the representation of the environment array for deeply nested closures. At the point at which you create a closure your total environment seems to be the sum of your own activation record as well as potentially N number of other activation records.

Eg:

    function f(x):
        function g(y):
            function h(z):
                return x + y + z;


Say we wanted to return h as a closure. It references free variables from f's and g's activation records. If it were a member function then it would additionally reference the quasi "activation record" of the object itself as per above. How would one close on the total set of those activation records without resorting to copying the values (which would be incorrect since you're closing on the variables, not the values), whilst keeping a single environment register? Would you link them up, and if so how would you access a particular value?

I guess you could have a VM instruction along the lines of:

    LOADOUTER activationRecordIndex, index


Where you load free variable `index` from activation record `activationRecordIndex`, assuming they are all linked.

I'm sure there's a better way to do this than what I can immediately come up with. Additionally, I'm slightly worried that if you close on entire activation records like this (as opposed to the few free variables you actually have), you artificially keep an excessively large number of objects live.

Any thoughts are muchly appreciated.

### Not as difficult a problem as it looks

I think this is described in any reasonable book on writing compilers, so I won't go into a lot of detail. You might also take a look at the archives of the venerable comp.compilers newsgroup, which is probably a better forum for questions of this kind.

The first thing is that you don't need to keep the entire activation frame around; you only need the bindings which are actually used in closed functions. If you do a little flow analysis, you will also be able to separate mutable bindings from constant bindings; the constant bindings can be copied if that is more convenient (and, if the constant value is known at compile time, can be eliminated altogether through constant-folding). This can considerably reduce the number of bindings you need to maintain.

The second thing is that you can always tell (at least with lexical scoping) how far up the linked chain of activation records you need to go to find a particular binding. So your VM instruction can include a pair <delta, index>, which means go up delta links, and then use binding index. That's very fast, particularly since the delta values tend to be small in common cases.
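This <delta, index> addressing (often called lexical addressing, with the links known as static links) can be sketched as follows; the `Frame` class and `load_outer` are illustrative names of my own:

```python
class Frame:
    """An activation record: an indexed slot array plus a static link
    to the lexically enclosing frame."""
    def __init__(self, slots, static_link=None):
        self.slots = slots
        self.static_link = static_link

def load_outer(frame, delta, index):
    """The <delta, index> instruction: go up delta links, use slot index."""
    for _ in range(delta):
        frame = frame.static_link
    return frame.slots[index]

# f(x=3) calls g(y=4) calls h(z=5); resolving h's free variables:
f_frame = Frame([3])
g_frame = Frame([4], static_link=f_frame)
h_frame = Frame([5], static_link=g_frame)

x = load_outer(h_frame, 2, 0)   # two links up, slot 0
y = load_outer(h_frame, 1, 0)   # one link up, slot 0
z = load_outer(h_frame, 0, 0)   # own frame, slot 0
```

Both delta and index are known at compile time for lexically scoped code, so no name lookup happens at run time.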

You're correct to worry about artificially maintaining too many live objects. Although I don't personally know of any language which does this, it would be possible to mark individual bindings in an activation record during garbage collection, instead of marking the activation record as a whole. A bit mask in each function closure would be sufficient to identify to the garbage collector which bindings were actually used by that function. Unmarked bindings would still occupy space in the activation record, but they don't need to be considered live.

For an alternative to this approach, you might want to take a look at the Lua virtual machine, whose architecture is well-described by its authors in this paper, which I believe has been discussed a bit on LtU. Lua separately heapifies each closed variable, and function closures are created with a vector of pointers to the heapified binding. That seems to work reasonably well in practice, but it could probably be optimised by doing more program analysis: in the particular case of instance bindings in an "object", for example, it is reasonable to assume that all instance bindings normally have the same lifetime as the object; that is, they are all live or all dead (perhaps with some rare exceptions). So it might be handy to have both individual bindings and vectors of bindings, instead of just individual ones.
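The Lua-style scheme can be imitated in Python like so (the `Upvalue` name follows Lua's terminology; the rest is my own sketch): each closed variable lives in its own heap cell, and closures hold pointers to the cells rather than copies of the values:

```python
class Upvalue:
    """One heap-allocated cell for a single closed-over variable."""
    def __init__(self, value):
        self.value = value

def make_account(opening):
    balance = Upvalue(opening)   # the compiler would heapify `balance`
                                 # because it escapes into closures
    def deposit(amount):
        balance.value += amount  # both closures share the one cell...
    def current():
        return balance.value     # ...so mutations are visible to each
    return deposit, current

deposit, current = make_account(100)
deposit(25)
result = current()
```

Because each binding is boxed individually, a closure keeps alive exactly the cells it points to, not whole activation records, which addresses the liveness worry raised earlier in the thread.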

Finally, I would urge you to avoid the lexical scoping design error in Python and Ruby ("local-by-default"). Use Scheme or Lua as a model for lexical scoping.

### ancient mental scars from reading Wirth's Pascal compiler

> Finally, I would urge you to avoid the lexical scoping design error in Python and Ruby ("local-by-default"). Use Scheme or Lua as a model for lexical scoping.

I think I would urge exactly the opposite, depending on what errors you mean. The simple statement "local by default" is, IMHO, exactly the right default for software engineering reasons, if you're talking about mutable variables.

As I understand it, Python and Ruby have two *different* "lexical scoping design" um, issues: in Python, you can only modify outer-scope variables if they're at the global scope and you specifically declare them to be "global" in the using scope. (You can get around this by using the "wrap it in an array" trick, so this is not really a prohibition, but rather a discouragement, adhering to the Zen of Python guideline that "Flat is better than nested".)
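Both halves of the Python issue can be shown in a few lines (my own example; this is pre-`nonlocal` Python, where only the `global` declaration exists):

```python
# Rebinding a module-level name requires an explicit declaration:
counter = 0
def bump():
    global counter     # without this line, counter += 1 would fail
    counter += 1

bump()
bump()

# For an enclosing *function* scope there is no such declaration, so
# the "wrap it in an array" trick substitutes mutation for rebinding:
def make_bumper():
    count = [0]
    def bump_inner():
        count[0] += 1  # mutates the list in place; no rebinding needed
        return count[0]
    return bump_inner

b = make_bumper()
b()
result = b()
```

The list is never rebound, only mutated, which is why Python permits it without any declaration.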

The error in Ruby might be this one: *a block parameter named the same as an already visible variable will alias that variable rather than shadowing it*, which, indeed, feels to me like a simple mistake.

If it is those two issues you're referring to, then I certainly agree about avoiding the Ruby one, though I disagree about using Scheme or Lua as a model.

IMHO, allowing unmarked closures over distant outer-scope mutable variables is a software engineering disaster(*). It seems to me that mathematicians and implementers prefer it because it's a simple model that is relatively easy to implement. But when used out in the wild, it provides an additional opportunity for the initial program author to inadvertently hide important information from the reader of the program. That is, the writer knows when a free variable is being used (or modified!), but hides that bit of info by using an identifier that looks just like any local identifier. Once the functions get big enough, this makes them harder to understand.

Now, perhaps this is considered to be a feature that encourages programmers to make smaller functions, in which case I think it's a good goal, but a poor way to encourage it.

In my mental (i.e., hobby/vapor) language designs, I provide the feature, but insist that such variables must be marked so that the reader knows that they're coming from "outer space". My preferred syntactic marking for them is the back-quote (\`) -- it's relatively unobtrusive and is rarely used in programming languages. I'm still undecided about marking outer-scope immutable variables in this or a similar way; and certainly standard library functions shouldn't need such markings.

(*) That's where the ancient mental scars came from -- trying to understand Wirth's Pascal compiler in the mid-1970's.

### I guess we all have our preferences

That's funny; I've always felt that Python's scoping rules are a back-handed way of encouraging smaller functions.

The problem with Python's lexical scoping, in my opinion, is that the scope of a variable extends to the entire function. In Lua, a local declared in a block (say, the then clause of an if statement) is local to that block; it does not interfere with a local anywhere else in a function. (The same is true of the control variables in for statements.)

In my Lua code, the declaration of a local is almost always immediately visible -- one or two lines above the use -- and the scope is almost always short. Since one of my maxims is "keep your scopes as small as possible", I will commonly throw a do...end block around the use of a temporary in order to avoid having the name present in the rest of the function. (And, similarly, my C code uses a lot of free-standing {...} blocks in order to limit the scope of locals.)

In Python you can't do that, which I sometimes find immensely frustrating -- and a source of annoying bugs.
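The function-wide scope being complained about is easy to demonstrate (my own example):

```python
def f(flag):
    if flag:
        temp = "bound in the if-branch"
    # In Lua, `temp` would already be out of scope here; in Python its
    # scope is the entire function, so the name leaks out of the block.
    return temp

ok = f(True)

try:
    f(False)
except NameError:          # actually UnboundLocalError, a NameError
    leaked_name_unbound = True   # subclass: local, but never bound
```

There is no way to confine `temp` to the `if` block, and the `f(False)` path is exactly the "used before it is bound" error the Python manual warns about below.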

I agree that mutable closed variables should be avoided whenever possible, but they are sometimes useful; good style would be to keep the definition as close as possible to the use and/or use a clear naming style, but languages cannot really enforce good style.

It's interesting to read section 4.1 (Naming and Binding) of the Python manual, which is a longish text including the following paragraph: (*)

> If a name binding operation occurs anywhere within a code block, all uses of the name within the block are treated as references to the current block. This can lead to errors when a name is used within a block before it is bound. This rule is subtle. Python lacks declarations and allows name binding operations to occur anywhere within a code block. The local variables of a code block can be determined by scanning the entire text of the block for name binding operations.

and compare it to Section 2.6 of the Lua reference manual:

> Lua is a lexically scoped language. The scope of variables begins at the first statement after their declaration and lasts until the end of the innermost block that includes the declaration.

It's possible that not only mathematicians prefer the simplicity of the latter. :)

(*) Section 4.1 of the Python manual is actually incomplete, since it does not allow one to predict the following behaviour:


    Python 2.4.1 (#2, Sep 27 2005, 09:37:17)
    [GCC 3.3.3 [FreeBSD] 20031106] on freebsd5
    >>> for x in (y for y in range(1,10)): pass
    ...
    >>> x
    9
    >>> y
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    NameError: name 'y' is not defined

### I think I'd like this

* No implicit variable declarations, i.e. a = 2 does not create a variable if it does not exist - you'd have to do something along the lines of a <- 2, or perhaps var a = 2. I think that takes care of some of the confusion of shadowing outer variables.

* New blocks create new lexical scopes. Shadowing is implicit.

* Perhaps no explicit syntax for outer variables, such as the suggested a <- 'x where x is a free variable. This would be asymmetric with how outer variables are accessed from inner blocks, as well as with how member variables are accessed.

Basically, it's the coder's job to use closures responsibly and not to make a dependency mess out of it. IMHO, with bad coders you get bad code no matter how protective the language, and good coders tend not to like languages that get in their way too much.

### Hard binding closures ...

There's one technique I've used that I don't have a name for (maybe "hard binding"?), but it works well: at closure creation time, create a copy of the code with all the free variables bound to whatever their values are at closure creation time. For example, when you call f(3), you get a closure whose code reads return m_x + 3 + y. This doesn't gel with the mutable m_x, actually, but it works fantastically with pure functions.

This technique avoids any kind of symbol look up when a closure is invoked, except for its arguments and in that sense it is more efficient than the environment pointer technique.
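As a rough illustration of the idea (my own sketch; a real VM would emit specialized bytecode rather than `eval` a source string), the closure's code is regenerated with the free variables replaced by their current values:

```python
def make_g(m_x, x):
    # Free variables become literals baked into freshly generated code.
    source = "lambda y: %d + %d + y" % (m_x, x)
    return eval(source)

g = make_g(2, 3)   # as if calling f(3) on an instance where m_x == 2
result = g(4)      # no environment lookup happens at call time
```

After creation, `g` behaves exactly as `lambda y: 2 + 3 + y`; later changes to the original variables are invisible to it, which is precisely the limitation discussed in the replies below.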

### Of course, that doesn't give you mutable bindings

With such a scheme, you may mutate the object itself if the binding is by reference, but not rebind the variable. This makes closures a goodly bit less powerful.

A bit less importantly perhaps (depending on usage), the closure would be a rather heavyweight object as opposed to a couple pointers.

Apart from that it seems a fine scheme.

### Assignment conversion

Some scheme compilers do some simple analysis to find which variables in a closure might be assigned to. They can then convert these variables to references, and the previous scheme still works fine.
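Assignment conversion can be sketched like this (the `Ref` cell and the function names are mine): only the bindings that the analysis finds assigned-to get converted to references, while read-only bindings can still be captured by value:

```python
class Ref:
    """A reference cell introduced by assignment conversion."""
    def __init__(self, value):
        self.value = value

def make_accumulator(step):
    # Analysis would find that `total` is assigned inside the closure,
    # so it is converted to a Ref; `step` is only read, so it is safe
    # to capture it by value (the preceding "hard binding" scheme).
    total = Ref(0)
    def add():
        total.value += step
        return total.value
    return add

acc = make_accumulator(5)
acc()
result = acc()
```

The conversion preserves sharing: everything that captures `total` sees the same cell, while the by-value capture of `step` costs nothing at call time.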

### boxing ...

While agreeing with most of rici's arguments for auto-boxing of captured variables, you sure can box your variable and use set!-style functions to change the contents.

A straightforward way to box is to define your variable as a cons cell where the head of the cell holds the value of the variable.

Another way to implement a variable is to treat the "variable" itself as a closure, where (var) gives you the current value of the variable and (var value) sets the value of the variable. This method has the advantage over the previous one that you can put validity checking into the closure - it's sort of like the getXX()/setXX() methods in Java. You can also extend it to do stream-like stuff - for example, you can implement (var val1 val2 ...) to mean "give me these values one after another every time I access the variable".
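Translated out of Lisp into Python (the `make_variable` name and the optional `validate` hook are my own additions), the variable-as-closure idea looks like this:

```python
def make_variable(initial, validate=None):
    """A 'variable' represented as a closure: var() reads it,
    var(value) assigns it, optionally running a validity check."""
    cell = [initial]
    def var(*args):
        if not args:
            return cell[0]       # (var)        -> current value
        if validate is not None:
            validate(args[0])    # hook for getXX()/setXX()-style checks
        cell[0] = args[0]        # (var value)  -> set the value
        return cell[0]
    return var

x = make_variable(10)
x(42)
result = x()
```

Since the variable is now just a callable, it can be passed around and closed over like any other value.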

[edit: pardon me for the explicit lisping, but that's the clearest way to say it.]

### That only differs in a small detail

In general terms, a compiled function (or closure) consists of executable code, in some format -- perhaps a vector of [virtual] machine operations -- and a vector of constant values which are referred to by the executable code. What we might call early (rather than hard) binding of free variables consists of creating a closure with a precompiled executable body and a freshly-created constant vector. (Perhaps for efficiency we would have two constant vectors, one with the hard-coded constants and the other with just the free "variables", but that's an implementation-level detail.)

Now, suppose we want mutable, shared free variables. The only thing we need to do is to put a "box" into the constant vector. The box itself is constant, but it's a mutable object.

This is the underlying model for both Python and Lua; the difference is that in Python, it is the programmer's responsibility to create the boxes and to deal with the mutation and dereferencing of boxed values (generally using an array, as scruzia mentions above); in Lua, all closed values are boxed and the language takes care of mutation and dereferencing.

Forcing the programmer to use an explicit box has some appeal, particularly if you believe that state mutation is an evil which ought to be castigated. (It's hard to make this argument for Python, though, since mutable class instances are clearly encouraged by the language.)

The problem with explicit boxes is that they introduce into the language most of the problems associated with pointers; after all, that's essentially what a box is. In particular, it leads to aliasing, since a box can be assigned to a different variable and it then becomes impossible to find all the places where a box is modified through simple textual analysis. By contrast, in a language like Lua with implicit boxes, the name of a closed variable is an absolute, and a scope-aware grep is sufficient to find all the mutation points in the program.

The other problem with explicit boxes is that it hinders refactoring: changing a closed variable from mutable to immutable, or vice versa, requires wholesale alteration of the program: every use point and every mutation point must be altered.

A language with implicit boxes can do the mutation check at compile time; since aliasing is impossible, simple textual analysis is sufficient to determine whether or not a given free variable is mutated or not. If the variable is never mutated, then the compiler could choose between a variety of optimization strategies, including, as Srikumar Subramanian suggests above, partial evaluation. If the partial evaluation cannot be done adequately at compile time, it could also be done by a just-in-time compiler at run time (perhaps that is closer to Srikumar's suggestion).

Another possible optimization is combining several boxes into a single multi-valued box, which essentially leads to the chained environment vector solution outlined much earlier in this thread. Again, implicit boxing makes it possible for the compiler to use optimization heuristics to decide when coalescing boxes makes sense; in the case of "class" objects implemented as though the instance variables were simply closed free variables, the compiler might quite reasonably choose to jointly box all mutable instance variables, as I suggested in my first response.

It's not uncommon for languages to insist that the programmer at least declare whether or not a given variable is mutable, either by using a keyword such as constant or by using := instead of = in the declaration (or possibly other mechanisms). This allows the compiler to issue error messages when a supposedly immutable variable is accidentally mutated, but has few other virtues; it shifts the bookkeeping responsibility from the compiler to the programmer, simply creating a redundancy in the source code. If the language's syntax is reasonably straight-forward, the mutability analysis could be done by an IDE, for example by colour-coding mutable closed variables, providing the same feedback to the programmer as the compiler error message but, in my opinion, at rather less cost. A good compiler ought to optimise the use of immutable variables whether or not the programmer remembers to mark them as immutable, so there's no other advantage to forcing explicit declaration of immutable variables other than laziness on the part of the compiler-writer.

> ... no other advantage to forcing explicit declaration of immutable variables ...

IMHO, it distinctly improves the code's readability, especially if it's a relatively unobtrusive distinction like := vs = or var vs val. I tend to prefer to have such aids baked into the source code syntax because the availability of IDEs for the most interesting languages tend to lag behind the availability of compilers/interpreters for them. (Scala provides a nice counterexample to that, though.)

### True, it does aid readability

However, it doesn't aid readability as much as syntax-colouring in an IDE, because the annotation is at the point of definition, rather than at the point of use.

I'd personally prefer to have a language liberal enough lexically to allow me to use some punctuation in names, allowing the convention adopted by several languages that mutating methods have a bang in their name (the Ruby example of sort vs. sort! springs to mind). That would make it possible to annotate the names of the few mutable free variables in your program in a similar fashion. (If I were dreaming of the perfect language, I'd also like the compiler and/or IDE to warn me about incorrect annotations, or even add them automatically to my code upon request.)

It's certainly true that IDEs lag behind language design, and that seems to me lamentable. There are certainly PLT researchers who have chosen to dedicate themselves to this problem, and there are some notable successes. But on the whole, IDE design happens after language design, rather than at the same time, sometimes with the result that particular language features -- notably difficult parsing problems -- make IDE design harder than should be necessary.

### explicit boxes...

> The other problem with explicit boxes is that it hinders refactoring: changing a closed variable from mutable to immutable, or vice versa, requires wholesale alteration of the program: every use point and every mutation point must be altered.

Didn't quite get that.

If you made a variable mutable in the first place, it must be for a reason and in that case you should have at least one mutation point in your program. If you subsequently make the variable immutable, you have to remove that mutation point no matter which language you wrote your program in.

If you made the variable immutable in the first place and you now need to change it, then maybe you should consider an alternative design because you were able to get along without mutation until that point. That said, in this case you do benefit from auto-boxing.