Publishing negative results: single-assignment Lisp

For the past few months I've been working on a language that I thought of as a hybrid of Scheme and Erlang. Lisp-1, non-hygienic macros, message-passing concurrency, implicit parallelism on function args, single assignment, no mutable values. Mostly I'm pleased with what I've learned from it, but today I realized that I couldn't see it as a usable language.

The problem is that single assignment prevents the user from redefining functions at the REPL. Or, rather, you can do it (it creates a new environment containing the new binding), but it doesn't affect the functions that use the old binding. I don't have a problem with that in ML; but I just couldn't see myself using a Lisp that didn't let me redefine functions.

I just wanted to post about it, on the theory that we don't see enough people talking about their failures. Maybe somebody else will see this and avoid the trap I painted myself into; or maybe they'll be inspired to find a way I didn't think of.

Oh, and why single assignment? Because it means that environments are immutable, too—the only time you get to bind is when you create a new environment, with (let)—which means that environments can safely be shared across processes, which makes processes cheaper. It also means that it's relatively safe to evaluate function arguments in parallel, since there are no assignments to get out of order. (It's only relatively safe, though, since message send/receive could get misordered.)

One decision that I'm still pleased with: I disallowed arbitrary cons cells; all lists are proper lists, as in, say, ML. I'm of the opinion that Lisp constructs such as (1 2 3 . 5) are more trouble than they're worth; every time you iterate over a list, you have to decide whether to check for improper lists or just have (car) throw an error. Also, when all lists are proper, and immutable, more optimizations become possible, such as, under the hood, using an array instead of a linked list.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Strange

This is strange, because a language such as Haskell has a REPL-like interface, and yet is still useful. It embeds the REPL in a monad, thus allowing redefinitions. Perhaps look into Haskell's monadic approach before you give in on this idea?

Not the Haskells I've seen

Haskell's REPL doesn't allow redefinition, either, though—since it doesn't allow definition. At least, hugs and ghci don't: defining a function or a constant yields a syntax error such as "(unexpected `=')".

ML (at least all the versions I've tried) does allow function definition at the REPL, in the same way my language does; but the lack of true redefinition doesn't feel like such a problem, there. That might just be because I haven't used ML for anything large yet—since I started on my language, I've started working at ITA, where I'm using Common Lisp heavily for the first time, and I've come to appreciate how useful redefinition is for debugging.

Redefinition in haskell

Haskell's REPL doesn't allow redefinition...

The repl is running inside the io monad, which means you can create let bindings

Prelude> let sum = undefined

However let bindings are lexically scoped so redefining functions has no effect on previous uses.

Prelude> let mean xs = sum xs `div` length xs
Prelude> let sum = foldl (+) 0
Prelude> mean [1,2,3]
*** Exception: Prelude.undefined

Hmm

...OK, that does work, in ghci (not in hugs). I didn't know about that use of let; I was doing "f x = x+1", as in a module, rather than "let f x = x+1".

As you say, though, it doesn't have anything to do with redefining.

Sorry...

Sorry about misinterpreting the meaning of "redefinition". I suppose redefinition of functions (meaning that you want to change the meaning of already compiled code) would require either a) that all function calls are indirect or b) that static references to the functions are patched. The first solution is certainly doable, with a slight performance overhead - whereas the second solution seems messy. Either way, it seems to me that both schemes are useful. You might want to keep the old functions intact until exploration of new functionality has been tested. At that point you might want to discard the new functions, or else inject the new definitions into the running system.

I can not agree

I always thought of the possibility to redefine something in REPL as of a hack that should not normally be used. So, I never designed and implemented a language which allows such a hack. And, I never missed this feature in OCaml REPL and GHCi as well.

The only real practical application I can think of is something like SISCweb, where REPL and runtime code patching proves itself quite useful for development and debugging. But, it is always possible to achieve the same with more transparent approach: use explicit hooks. REPL itself is what really rocks, not a code patching.

Debugging

runtime code patching proves itself quite useful for development and debugging

Yes, debugging is where redefinition comes into its own. I change a function and hit C-c C-c, and I can test the new definition without restarting (which can take a long time with what I'm working on).

Using explicit hooks would be an alternative; but I'd have to add the hook call, then restart the process, then remember to remove the hook call before checking in (leaving the hooks in place for every function that's ever been modified would be a bad idea for performance).

As for debugging, it is easy

As for debugging, it is easy to toggle a hook wrapper using macros. E.g., in my Lisp implementation (MBase), I'm using simple macro (debug: ... defns ...), which wraps all the function definitions inside into sequences like (begin (function $name $args "original function" ...) (define $hook (mkref $name)) (function $name $args "wrapper function" ((deref $hook) . $args)))

It is also possible to switch this behaviour on and off for everything by redefining (function ...) macro itself.

So, if a lisp implementation provides macro metaprogramming, it is always possible to make it patchable in runtime for debugging, without performance tradeoffs in production build.

Still doesn't work

(a) In a single-assignment world, hooks don't work, because you can't modify the value of the hook.

(b) Given your (debug) and modified (function) macros, you've got redefinition. The fact that your interpreter doesn't know about it is a red herring; it'll provide the same opportunities for bad code as if it were built in.

Why? It works well: a) hook

Why? It works well:

a) hook is a mutable reference wrapper (like ref in OCaml). It can be internally implemented as (cons something nil), deref as (car ...), and update as (set-car! ...).

b) It is not a redifinition, since the original function name could be mangled, only a wrapper function (which calls a hook value) is exposed. It does not affect neighter performance (compiled functions refers to other functions directly, and even inlining won't affect this behaviour) nor a possibility to debug (just replace the hook value, and a wrapper function will call the new one).

The only constraint here is that you can't change a function signature, but I suspect you'll never want to change a number of arguments for a function which is called from somewhere else.

(a) I don't have any mutable

(a) I don't have any mutable references; I have no mutable values at all. In my case, the whole point of single-assignment is that there is nothing mutable, so that separate processes can see the same data, without fearing that it's going to change.

(b) I'm not going to argue about.

debugging

An interesting question to ask: should debugging be considered as interaction/extension within a system, or supervision/modification of a system by some meta-level?

You might consider the combination of single assignment from within the language, but the ability to inspect or arbitrarily change the state (including reassignments) from the debugging meta-level. Then the question becomes: how is the system represented abstractly as a term in the debugging meta-level? This is essentially a matter of the debugger's user interface.

Good point

Interesting. I've been assuming I had to work within the capabilities of the language, because I was writing the REPL in the language; but you're right, it doesn't have to be that way.

And it would be possible to write the REPL in the language extended with debugging primitives, as long as the code the REPL executes is run in an environment that doesn't have those primitives.

lexically-scoped dynamic binding

I'd recommend looking into lexically-scoped dynamic binding. I'm currently working on a paper on this topic. The idea is you have standard "let x = e1 in e2" bindings and you have dynamic override bindings "dlet x = e1' in e2", which temporarily locally update the value of an existing lexical variable x. This is not entirely new: fluid-let and parameters in Scheme work similarly. For example:

let x = 1 in
let f z = x in
dlet x = 2 in f 0

would evaluate to 2 (it would evaluate to 1 if the dlet were changed to a let). But:

let x = 1 in
let f z = x in
let x = 3 in
dlet x = 2 in f 0

would still evaluate to 1, because the x inside f isn't the same x as redefined by the dlet.

If you define your REPL so that new definitions are dlet statements with scope the rest of the interaction session, you're part of the way there. If you can pass function closures in messages sent to other processes, then you can explicitly choose to import definitions: the sender includes a function (call it D) that introduces some dynamic bindings and then calls a thunk within that context. The receiver packages its body as a thunk and passes it to the dynamic context creation function D included in the message. You could probably use macros to introduce some syntax that makes this pattern more succinct.

If dynamic rebinding of arbitrary lexical variables doesn't meet your performance objectives, you might find it useful to mark which variables can be overridden, for example by using something similar to Scheme's parameter objects.

Interesting, but...

An interesting idea, but I'm not happy with the idea that any variable can be dynamically overridden. In the example you give, it's not possible to look at f and know what the value of x is, or where it's defined.

I'll have to think about it. Maybe if it applied only to variables that had been declared to be dynamic in the first place (as in Common Lisp's (defparameter)).

Limiting dynamic overrides

Differentiating between dynamically overridable and non-overridable lexical variable bindings is certainly a reasonable idea worth considering. Here are some other, related ideas:

One of the interesting things about lexically-scoped dynamic variables is it's easy to limit dynamic overrides of variables without adding any additional language constructs. Simply define another lexical variable with the same name, and the previous one is shadowed and protected from being overridden by code in the inner scope. Example:

let x = 1 in
let f z = x in
let x = x in
dlet x = 2 in
f 0

will evaluate to 1, because the "dlet x = 2" overrides a new lexical x, which happens to have the same initial value as the original x.

This is different from defparameter/defvar in that Common Lisp's dynamic variables have no lexical scope restriction, so you can't do the above sort of thing to protect dynamic variables from being redefined.

Parameter objects as in Scheme (SRFI 39) also allow you to limit which variables are dynamically overridden, at the cost of having to explicitly dereference the value of the parameter, rather than using it like a normal variable (in Scheme, this is simply putting the parameter in parentheses to invoke it as a nilary function).

late binding?

Your mix of languages features is very similar to what I would like as a language design. Particularly single assignment scheme producing new environments with each assignment. However I was not interested in having macros and instead I wondered about integrating a javascript style syntax to access those environments.

Essentially the environments would take the place of the prototype object chains in javascript and I would provide for javascript dot syntax. If dot syntax was used, eg. object.name then this would invoke a late binding dispatch in the most recent environment. The details at this point are hazy but in a sense its like the objective C idea of embedding a dynamic runtime with its own syntax support. Symbol names without a dot are lexically bound as per scheme.

Obviously this is a departure from what you had in mind but I wonder if you think such a combination might work around your current limitations.

Interesting, but not a solution for me

That sounds like a good language. I've actually thought about using environments as objects before. The standard OOP distinction between objects and lexical environments is kind of artificial, after all; why have two sets of rules for looking up bindings when one will do? Your approach would obviously support prototype-based inheritance ("the prototype of this object" translates directly to "the parent of this environment"), and could be extended easily for class-based inheritance (as in various JavaScript libraries).

If I were to build such a language, I'd probably mine The Art of the Metaobject Protocol for ideas; some of its particular details require macros, but the overall approach is probably applicable.

It wouldn't solve my (admittedly artificial) problem here, though, since my environments are immutable.

Aren't immutability and redefinition fundamentally inconsistent?

My suggestion is to not think in terms of the general properties of the language, but instead to start from the specific and concrete problem of the debugging style that you love. i.e. look at it from the other end. Maybe it has particular properties that you can accommodate within the language (e.g. no return value or side-effects, so it can be invisibly inserted/removed, without semantically altering the system, like instrumentation for profiling/debugging - but it depends on what you're doing when debugging).

You could look at what you have historically actually done when debugging, to determine the smallest requirements.

I also like Sameer's comment.

You're right

You're right, they are inconsistent; and, yes, designing with my debugging style in mind in the first place would have helped. I started on this language before I started at ITA and discovered the Lispy debugging approach.

What it boils down to is this: I tried designing a hybrid of Lisp and Erlang, in the hopes of combining their strengths; but the result threw out most of the strengths of Lisp. I was left with an S-expression Erlang (subset), with macros and implicit parallelism; that's not enough to be interesting.

Check out how Clojure did it

John,
Clojure has an interesting approach to this problem via so called "Vars". It doesn't solve the impossible, but solves the practical side of things brilliantly, IMHO. I tried to explain it here in a few sentences, but it became too long, and I did't have time to make it shorter. You can check out http://clojure.org/vars instead.

Ooh

Ah, clever. Thanks!

Tricky

On reflection, using it in my language would require a good deal more intelligence from the runtime (and the compiler, when I have one; at present, I just have an AST interpreter). The problem is implicit parallelism: if a single process can be doing two things simultaneously, then per-process bindings aren't enough. Some sort of static analysis would be needed to determine when implicit parallelism is safe.

Cool

Can you tell something about your AST interpreter. Did you derive it from other work? How fast is it?

I wrote a small trampolining eager interpreter for a lambda+ subset. I performs around 3-4 million rewrites on a 1 GHz machine (about 200 instruction per rewrite). Not really fast, but fast enough for the moment. [BTW: pattern matching is done by the interpreter and almost every rewrite introduces a closure, therefor the high count.]

Quite slow, actually

Let's see. I don't think there's anything all that original in it; it's my first serious effort. It's pretty slow, but I never did measure it much.

I wrote it from scratch, in C++; values in the heap are of the class Pointer, which is a 32- or 64-bit value (same size as a void*) with type tags. The GC is pure reference counting (I planned to add something better later, of course). When a Pointer actually points to a Value (instead of being, say, an integer), its constructor and destructor call Value's methods to update the reference counts (atomically); when the count drops to 0, the Value gets deleted. So a Pointer in a C++ variable automagically keeps its referent live, without needing any code to inspect the stack. My hope was that this would be pretty fast (or, at least, not have a lot of pauses), since deletions would happen in small bits as Pointers went out of scope. In practice, I ran into the problem where, as soon as a Pointer to a large data structure went out of scope, each of its elements was freed, one at a time, recursively. (This was exacerbated, in my unit tests, by the fact that I was constructing a fresh global environment for each test.)

The interpreter was built in continuation-passing style. A function is essentially a continuation constructor: call it with an environment, and a next continuation, and get back a continuation. When a continuation is called, it returns zero or more continuations, which go into a queue. (In the normal case, where a function is returning a single value, its continuation curries that value onto its next continuation.) So then the main interpreter loop just does while true {extract C from Q; for C' in C() {Q.insert(C')}}. The way to take advantage of N cores is to have N threads running that loop (with refinements for performance, of course).

The CPS interpreter gives TCO for free.

The language supports multiple return values, almost as a side effect of the continuation-passing style: if you want to return 3 values, just pass three arguments to your next continuation. The syntax is more unified than in Common Lisp: (let ((x 7) ((a b) (values 1 2))) (list x a b)) returns (7 1 2). As in Common Lisp, excess return values are discarded.

The language has implicit parallelism: function arguments are evaluated in parallel. (a b c) returns three continuations to evaluate a, b, and c; when the last of them returns, it passes the resulting arguments to the continuation for a.

The language also has explicit parallelism, in the Erlang style: (spawn f) creates a process running f; (send p m) sends message m to process p; (receive) returns two values, a message and the process that sent it. (I don't like the way the receiver in Erlang can't tell who sent a message; it assumes too much about the trustworthiness of other processes.)

I/O streams are processes: the (read) and (write) functions work by sending messages to the stream process. This is meant to make it simple to implement stream wrappers in the language, since most wrappers need some sort of mutable state.

I tried to minimize the amount of functionality that had to be implemented in C++. Even the parser is implemented in the language. I sidestepped the bootstrap problem by writing the parser in Scheme syntax, and writing a bit of Scheme code to read the parser code and generate C++ to build an AST. (I could've written that C++ by hand, but it would've been nasty.) Then the parser is used to load the rest of the standard library.

Hmm. Getting all this written down has gotten me interested in it again. Maybe I will go back to working on it, even if I can't do redefinition; there's still good stuff for me to learn from it.

Seems good

Why not go for the Boehm-Demers-Weiser as a garbage collector? I am absolutely sure (ok 99.99%) it is a lot faster than any explicit reference counting collector you could write in C. Your implementation actually becomes simpler that way. I am not sure how it handles parallel garbage collection though, I think it just stops all processes so you might pay for that.

I like the parallel evaluation on multiprocessors. (Although experience from Haskell/Clean contradicts that it would actually be faster. But they didn't have multi-core machines, so it would be interesting to see whether it would work out on modern processors.)

As far as your debugging problem goes, well, I don't seem to get lisp inside my head, so it actually seems like a benefit to me. (And self modifying programs were abandoned 20 years ago as programming paradigm so there is nothing lost there, I think.)

Boehm GC

When I started on this, I was motivated by a desire to write a garbage collector; I just needed a language to make use of it. My first sketch included mark-and-sweep, and I was intending to go back and improve it; but I got more interested in the language. After a few sketches in different languages, I started over with a fresh C++ implementation. At that point, it might've made sense to adopt the Boehm GC; not sure why I didn't look into it.

<google, google>...but you're right; it turns out that the Boehm GC doesn't do parallelism. There is an extension that does, developed by Kenjiro Taura, Toshio Endo, and Akinori Yonezawa; but it hasn't been updated since 2003, at which point it was labeled version 0.4.beta1. Maybe if I were more experienced with GC, I'd be more comfortable using a betabeta; but, if I were more experienced with GC, I'd be more comfortable writing my own. ;-)

Well...

Since you compile yourself you determine how the C++ objects you create are lay out in memory (right?). It should be possible to create a simple stop and copy garbage collector, or maybe a conservative collector, if you just make sure that all fields in an object are at least of pointer-size size. Or you can tag the pointers (or non-pointers yourself) before storing them in the object. Both ways your garbage collector doesn't need to know exactly what C++ object it is collecting. You are still stuck with synchronization though.

Probably better actually is to use Boehm, but wrap your own allocation strategy around it (build some pools for each process) and then manually synchronize and invoke the collector (when one of the pools runs out space). You could keep the layouting trick though, it would actually help the Boehm collector differentiate between pointers and other stuff.

Or just wait until a really parallel Boehm is out. It really is hard to beat it in performance. (Actually, I don't get it isn't parallel yet, what collector do they have in gcj then? That runs on multicore machines, doesn't it? Maybe your problem is solved already?)

[I should add: just my $.02 thoughts on it ;-)]

[Another thought, there are lots of parallel generational collectors out there in the wild, can't you just download one?]

[Or can't you keep all heaps separate? This is the way I want to go. Parallel GC is all nice and dandy, but for most GC algorithms you'r kind-of always stuck with global GC sweeps, and I don't want one O(n*k) sweep, but k * O(n) sweeps, for k processes, since that gives more deterministic and timely responses.]

Good ideas

Since you compile yourself you determine how the C++ objects you create are lay out in memory (right?). It should be possible to create a simple stop and copy garbage collector,

Oh, yeah, I can do it—in my first sketch of the language, I had a mark-and-sweep collector. I just haven't done it yet.

Actually, I don't get it isn't parallel yet, what collector do they have in gcj then? That runs on multicore machines, doesn't it?

gcj does use the Boehm collector, but, no, it doesn't take full advantage of multicore. From Boehm's page:

In its default configuration, the Boehm-Demers-Weiser garbage collector is not thread-safe. It can be made thread-safe for a number of environments by building the collector with the appropriate -DXXX-THREADS compilation flag. This has primarily two effects:

  1. It causes the garbage collector to stop all other threads when it needs to see a consistent memory state.
  2. It causes the collector to acquire a lock around essentially all allocation and garbage collection activity.

Since a single lock is used for all allocation-related activity, only one thread can be allocating or collecting at one point. This inherently limits performance of multi-threaded applications on multiprocessors.

So, yes, gcj runs on multicore machines, in the sense that it's MP-safe; but it can't make much progress while GC is running.

Another thought, there are lots of parallel generational collectors out there in the wild, can't you just download one?

Probably, yeah. It'd have to support both x86 and ARM, though, because I do a lot of my development on my Nokia N810 (gotta make some use of that 100 minutes/day on the train). The approach I've taken so far has the advantage that the garbage collector doesn't have to grovel around in the stack.

Or can't you keep all heaps separate?

Not really, because processes are not the only parallelism (I evaluate function arguments in parallel). Besides, my processes share data (they just can't write to it), so keeping the heaps separate would require copying.

Separate heaps might help somewhat, though, since most of the data would either never be visible to more than one process, or would never become garbage (e.g., the contents of the global lexical environment).

Lock Free Garbage Collection

I read Lock-free Parallel Garbage Collection by Mark and Sweep, Gao, H, Groote, J.F., Hesselink, W.H. a few years ago; it's nice that it is formally verified, it's pretty easy to make mistakes in simple GC's at it is. I am not sure you can download their collectors though.

Two comments: In Erlang we

Two comments:

In Erlang we originally had the pid of sending process always occur when receiving a message but found we didn't really need it. It was so common to just not use it we removed it. Also it is quite common that the process you which replies to you is not the same one as you sen the message to. This becomes weird if you have the sender explicit.

Not being able to select which message you wish to receive becomes quite unmanageable in real life, you quickly get a combinatorial explosion of states. The alternative of carrying around the messages in which you are not interested which you happen to receive for later processing is also quite messy. From your example this is what I assume you meant.

Lisp multiple values make it easier

In Erlang we originally had the pid of sending process always occur when receiving a message but found we didn't really need it. It was so common to just not use it we removed it

Sure, but that's not as much as an issue in a language with Lisp-style multiple return values, where values you don't ask for get ignored. Sample code:

(let ((msg (receive))) ...)

(let (((msg sender) (receive))) ...)

In the first line, the sender PID from (receive) is just dropped on the floor. So including the sender PID doesn't incur any convenience cost to the user. (There could be some runtime overhead, but a compiler could take care of that by noticing how (receive) is being used.)

Also it is quite common that the process you which replies to you is not the same one as you sen the message to.

Well, I can appreciate that; but I'm leery of making that the only way to build a system. I suspect my assumptions are different because most of my concurrent work has been over open networks, where you really want to know who's talking to you. I can't stand network APIs that have recv() but not recvfrom().

Not being able to select which message you wish to receive becomes quite unmanageable in real life,

Oh, absolutely. My (receive) has some of that (and I intend to add more), but I elided it here. So far, you can key on the sending PID and/or on the car of the message.

I have been doing the opposite

It is interesting but I have been working in the opposite direction. I have started from Erlang and its VM and built a lisp which has all the features of Erlang and is completely integrated with Erlang. It is called Lisp Flavoured Erlang (LFE).

It is neither Scheme nor CL as these cannot be implemented on the Erlang VM and it has Erlang properties which are not common in more "normal" lisps, for example you can have functions which have the same name but take a different number of arguments. LFE is definitely a Lisp-2 language, because Erlang is, and CL oriented. It has a reasonable macro facility.

For an old lisp hacker like myself it is very satisfying to write lisp Erlang and still get all the benefits of Erlang and the Erlang way.

Nice!

Interesting! So are you translating LFE to Erlang?

I thought about building something in/on Erlang; but I felt like building something from scratch, as a learning exercise.

Yes, my goal was to build a

Yes, my goal was to build a lisp on the Erlang VM. I wanted something that would work together with all the existing Erlang libraries and code, something that would have the benefit of both worlds. And it was fun to do. :-)