## IEEE Scheme expiring soon

Scheme has an IEEE standard, IEEE 1178-1990, which describes a version of the language slightly later than R4RS (#f and the empty list are definitely different) but not yet R5RS (no syntax-rules, dynamic-wind, or multiple values). That standard was reaffirmed unchanged in 2008, and will come up again for renewal in 2018. What's the Right Thing?

I see three reasonable courses of action:

1) Do the work to make R7RS-small the new edition of IEEE Scheme.

2) Do the (lesser amount of) work to reaffirm IEEE Scheme unchanged for the second time.

3) Do nothing and allow the IEEE standard to expire.

Does anyone still care about having an official, citable standard for Scheme?

(When I brought the question up on #scheme, someone asked what R7RS-small implementations exist. Currently there are Chibi, Chicken (partial), Foment, Gauche, Guile (partial), Husk, Kawa, Larceny, Mosh (partial), Picrin, Sagittarius.)

### Well, I'm not going to do it this time...

I'm no longer a regular user of Scheme, I do not use the recent versions of it at all, and I'm no longer a member of IEEE. So I'm not going to spearhead the standard renewal process with IEEE again.

I also feel that recent language design choices have made the language less worthwhile to use, and wouldn't like the job of defending them to a committee.

Sadly, R4 may be the most recent step in the right direction that the language design took, and positive progress has ceased.

In particular, and starting with the worst misstep:

The handling of Unicode as specified since R6 is just plain wrong.

It makes people care how strings are represented, and it preserves a "character" data type which is representation-dependent and not a tenable primitive for expressing strings given Unicode string equality semantics. It breaks homoiconicity to have different rules for symbols and strings. And if you really need an array of codepoints, it can only be because you are using it for something that is not a semantically sensible operation on strings. People want to treat strings as arrays of characters, but in order to do that they have to have a "character" unit which no longer maps to anything human beings think of as characters, and which randomly assigns different lengths to the same string when it happens to be expressed using different sequences of codepoints. Transformations that "patch up" these inconsistencies usually do so in ways that create further inconsistencies; they make it possible, for example, to split a string, concatenate the divided parts, and get back a different string than you started with.
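The length inconsistency is easy to demonstrate (a minimal sketch, assuming an R7RS implementation whose characters cover full Unicode; `#\x....` is R7RS hex character syntax):

```scheme
;; "é" written two ways: one precomposed codepoint, versus a base letter
;; plus a combining accent.
(define precomposed (string #\x00E9))     ; U+00E9, LATIN SMALL LETTER E WITH ACUTE
(define decomposed  (string #\e #\x0301)) ; #\e followed by U+0301 COMBINING ACUTE ACCENT

(string-length precomposed)               ; => 1
(string-length decomposed)                ; => 2, though a reader sees the same one-character text
(string=? precomposed decomposed)         ; => #f under codepoint-wise comparison
```

Splitting `decomposed` between its two elements yields a string that begins with a bare accent, exactly the malformed result described in the comment.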

It is an error to conflate arrays and strings, given that string equality as understood by humans and as specified in Unicode no longer depends on the equality of array elements, and that string length as understood by humans and as specified by Unicode no longer matches the number of array elements. Furthermore, manipulating arrays of the codepoints that the standard now mistakes for characters requires the semantic string operations standardized by Unicode to be re-implemented in Scheme. Divisions on codepoint rather than character boundaries can create characters that were not there before and can create malformed strings that start with non-characters such as bare accents. Mistakes can easily result in other malformed sequences which are not strings, or in strings which contain things that are not and never have been characters; and treating strings as non-linguistic arrays of data encourages non-string misuse of the structure.

The only principled and consistent thing to do, given the Unicode standard, was for strings to be immutable values, for characters and strings to be the same type, and for characters to be distinguished only by there being no semantically consistent (i.e., without creating a malformed string) way to subdivide them.

Something which is less bad, but still wrong:
Exceptions are a serious impedance mismatch for continuation-based control. And the guard forms specified via winding continuations since R5 are not compatible with exceptions at all. If you want exceptions, you need a multithreading mechanism that they do not conflict with. If you want continuation-based control, you need continuation arguments to call in case of whatever event. If you want these to be dynamically scoped rather than explicit arguments, which most people who want continuations desire, then you need to nail down the semantics of a dynamically scoped environment for them to be variables within. Scheme already has dynamic scoping for certain constructs such as current-output-port and so on, and ought to have either scrapped these for consistency with lexical scoping, or nailed down the semantics of its dynamic scoping, long ago.
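For reference, R7RS-small later gave this kind of dynamic scoping an explicit construct, parameter objects; current-output-port is one of them. A minimal sketch:

```scheme
;; A user-defined dynamically scoped variable, in the same style as
;; current-output-port:
(define depth (make-parameter 0))

(define (show) (depth))       ; reads the dynamic binding, not a lexical one

(show)                        ; => 0
(parameterize ((depth 1))
  (show))                     ; => 1, rebinding is visible for the dynamic extent
(show)                        ; => 0 again
```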

Moving on to something not quite as bad but still wrong:
The handling of uniform numeric vectors as specified in R7 is also just plain wrong, for reasons similar to those that make strings wrong now. It invites people to write code which relies for its correctness on limitations of the underlying data representation, and in which the same constructs have different semantics when applied to different objects which are nominally of the same type.

Finally, something that is a damn shame although not quite completely wrong:
Multiple returns are a good idea but the multi-argument continuations specified in R5 and subsequent standards make them hard to use, ridiculous, and confusing. They needed a proper syntax that allowed binding the return values to variable names at the call site and within a definite scope, in a way consistent with the way lambda expressions bind values to variable names at the definition site and within a definite scope.
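For comparison, receiving multiple values with the bare R5RS primitives is as indirect as described; the call site cannot simply name the results (a minimal sketch):

```scheme
;; The producer/consumer protocol requires an explicit thunk and lambda:
(call-with-values
  (lambda () (values 1 2))    ; producer of two values
  (lambda (a b) (+ a b)))     ; consumer binds them to names
;; => 3
```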

In general, I consider many major design decisions since R4 to be mistakes. Most new constructions are not sufficiently well designed to be consistent. In many cases elements have been introduced which conflict with each other or with existing design elements. No one is attempting to preserve the semantic simplicity which derived from being able to think in terms of semantic types with uniform operations rather than in terms of machine representations with many different semantics for similar operations. That semantic complication is a rich source of program errors and lessens the utility of the language as a means of expressing, explaining, or reasoning about algorithms.

R4 was a limited language, but at least the parts it had were parts that actually fit together. What we have now isn't worth the effort to make a standard for.

### I'm no longer a regular user

I'm no longer a regular user of Scheme, I do not use the recent versions of it at all, and I'm no longer a member of IEEE. So I'm not going to spearhead the standard renewal process with IEEE again.

I didn't expect it.

R4 may be the most recent step in the right direction that the language design took

You don't mention syntax-rules. Do you consider that a mistake as well?

The only principled and consistent thing to do given the Unicode standard was for strings to be immutable values, for characters and strings to be the same type, and for characters to be distinguished only by there being no semantically consistent (ie, without creating a malformed string) way to subdivide them.

I agree 110%. Unfortunately, it was simply not within the remit of the Working Group to make backward-incompatible changes to IEEE 1178. Our charter set an unreasonably high bar against that. I'd love to see R8RS take exactly this view.

I would add one further point and one caveat. The use of non-negative integers to index into strings is a mistake: string cursors should be opaque objects. SRFI 130 (a rewrite of Olin's SRFI 13 string library) takes this point of view. The caveat is that you assume that there is a single point of view about what constitutes a character to a human being, but that turns out not to be the case: people can and do disagree about how to divide the same text into characters. That being so, the codepoint view provides a rock-bottom mechanism that other more sophisticated views can be layered on in order to serve different purposes.
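To make the cursor idea concrete, here is a toy model in plain R7RS, with cursors as opaque records so client code cannot do integer arithmetic on them. The names echo SRFI 130, but this is an illustrative sketch, not the SRFI's actual API:

```scheme
;; Cursors are encapsulated: clients get no access to the index field.
(define-record-type <cursor>
  (make-cursor index)
  cursor?
  (index cursor-index))

(define (string-cursor-start s)  (make-cursor 0))
(define (string-cursor-end s)    (make-cursor (string-length s)))
(define (string-cursor-next s c) (make-cursor (+ 1 (cursor-index c))))
(define (string-cursor=? c1 c2)  (= (cursor-index c1) (cursor-index c2)))

;; Client code traverses a string without ever seeing an integer index:
(define (count-chars s)
  (let loop ((c (string-cursor-start s)) (n 0))
    (if (string-cursor=? c (string-cursor-end s))
        n
        (loop (string-cursor-next s c) (+ n 1)))))
```

Because the representation is hidden, an implementation is free to make `string-cursor-next` step by grapheme cluster or by codepoint without changing client code.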

Scheme already has dynamic scoping for certain constructs such as current-output-port and so on, and ought to have either scrapped these for consistency with lexical scoping, or nailed down the semantics of its dynamic scoping, long ago.

Formal semantics is all Greek to me, so I can't defend this point, but I've been told that the R7RS formal semantics does nail this down.

The handling of uniform numeric vectors as specified in R7 is also just plain wrong

I don't understand your reasoning here. It's true that you can't do anything with bytevectors that cannot already be done with vectors, but you cannot do anything with vectors that cannot be done with lists. Indeed, a vector can be abstractly considered as a list of pairs whose cars are mutable and whose cdrs are immutable.

Multiple returns are a good idea but [...] needed a proper syntax that allowed binding the return values to variable names at the call site and within a definite scope, in a way consistent with the way lambda expressions bind values to variable names at the definition site and within a definite scope.

The R7RS syntax let-values, let*-values, and define-values serve this purpose, if I understand your objection correctly. The first two come from SRFI 11 and are present in R6RS as well.
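For example (a minimal sketch; let-values is R7RS-small and SRFI 11):

```scheme
(define (div/mod a b)
  (values (quotient a b) (remainder a b)))

;; The two returned values are bound to names at the call site,
;; scoped to the let-values body:
(let-values (((q r) (div/mod 7 2)))
  (list q r))
;; => (3 1)
```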

### quote

"You don't mention syntax-rules. Do you consider that a mistake as well?"

I do.

I'd be much happier with a more powerful system that wasn't a DSL.

I made comments about it here and here

### syntax-rules

I'm conflicted about syntax-rules. On the one hand, it enables some things that truly needed to be enabled - hygiene is a somewhat serious restriction when a language has no access to any dynamically scoped structures, as reflected by the fact that more than half of Scheme's native syntax forms deal with its special predefined set of dynamic variables.

On the other hand, what was needed to serve that role was a proper semantics for dynamically scoped variables that didn't allow them to be confused with or shadowed by lexically scoped variables. That would have made all sensible usage of syntax-rules unnecessary, and the particular semantics they chose is unclear, inconsistent with the rest of the language, and can be used to provoke completely unexpected behavior such as hiding a function evaluation within the evaluation of a symbol which appears to be merely a variable reference.

So I'm categorizing syntax-rules under "the surgeon really did need a cutting instrument; a scalpel probably would have been more appropriate than a chain saw."
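For readers who haven't seen the mechanism, the standard demonstration of the hygiene it provides (a minimal sketch):

```scheme
;; The macro introduces its own tmp; hygiene effectively renames it,
;; so it cannot capture or be captured by a user variable named tmp.
(define-syntax my-swap!
  (syntax-rules ()
    ((_ a b)
     (let ((tmp a))
       (set! a b)
       (set! b tmp)))))

(define tmp 1)
(define x 2)
(my-swap! tmp x)
;; now tmp is 2 and x is 1, as intended; in an unhygienic system the
;; user's tmp would have collided with the macro's temporary
```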

### Could you help me?

Can you give an example of a system of macros with dynamically scoped variables that solves the hygiene problem? I ask because I really don't understand, and I want to, since I'm about to implement a macro system.

### That IMO much-overquoted

That IMO much-overquoted remark has never applied to library features. If it did, R2RS (the first Scheme report to have a procedure library at all) would have had only car, cdr, cons, pair?, and null?, since lists and atoms (defined by exclusion) are a sufficient basis for programming anything, and there would have been no need for R[3-7]RS at all. (It actually suffices to allow just one atom.)

As you know, I admire Kernel very much, but it is not Scheme, and my purpose at present is to determine how to standardize Scheme.

### Unlibrary

Doesn't apply to library features; quite right. Though none of those three issues is about library features. (Admittedly apply-continuation is library, but I merely used it as a more direct illustration of my point than the primitive continuation->applicative under which the relevant rationale discussion occurs.) Also granted, Kernel is a bit too far afield in design space to qualify as a Scheme; but I do maintain that those three features added to R5RS Scheme are philosophically inappropriate for Scheme because they introduce lack-of-generality into the language whereas there are ways to accomplish such things that are philosophically consistent as the R5RS solutions are not.

### Lack of generality has been there from the beginning

Scheme has always had second-class mutable variables rather than first-class boxes (cells). It's somewhat mysterious why this is so. Steele remarks somewhere that Scheme doesn't have cells because it would have taken an extra year to make it have them, and I always wondered what he meant by it. Certainly boxes (in the form of hunks) were available on Maclisp, and it would have simplified the Rabbit compiler considerably if all variables were immutable. I finally realized that he meant it would have taken too long to figure out how to optimize away the use of explicit boxes in cases where they were not necessary.

As dynamic languages go, Scheme is actually rather static in everything but its type system. For example, the effects of redefining a name in the global scope are undefined. Procedures are not S-expressions, though their external representations are.

### huh?

I find this confusing.

When you say variables are second class instead of boxes I understand that as "you can't take a reference to a variable".

But then you say you want them to be immutable.

What's the point of immutable boxes? If they're immutable you might as well pass values.

A lot of languages can't pass by reference. It turns out not to be a big deal, since you can box by hand in the very few cases where you need it.

### Assignment conversion

Assignment conversion changes variables that are actually mutated (lexical scoping makes it possible to determine these for sure) into immutable variables holding mutable boxes, thus putting all mutation into data structures rather than the language itself. Some Scheme compilers in fact do this, on the assumption that mutated variables are rare and a few such boxes scattered through the heap are no big deal.
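Sketched in Scheme terms (box, unbox, and set-box! are the SRFI 111 names, defined here so the sketch is self-contained; the conversion itself is ordinarily performed by the compiler, not the programmer):

```scheme
;; A box is just a one-field mutable record:
(define-record-type <box>
  (box v)
  box?
  (v unbox set-box!))

;; Before conversion: the variable n itself is mutated.
(define (make-counter)
  (let ((n 0))
    (lambda () (set! n (+ n 1)) n)))

;; After conversion: n is an immutable variable bound to a mutable box,
;; so all mutation lives in a data structure rather than in the language.
(define (make-counter*)
  (let ((n (box 0)))
    (lambda () (set-box! n (+ (unbox n) 1)) (unbox n))))
```

Both versions behave identically to callers; only the location of the mutation differs.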

However, if you were to get rid of set! and simply require users to use the boxes themselves, as ML does, it becomes much harder to optimize away boxes when they turn out not to be necessary. It is precisely the advantage of making things second-class that they are often easier to optimize.

### Procedures are not

Procedures are not S-expressions, though their external representations are.

I find myself uncertain what you mean by this statement. It might be a statement that procedures are encapsulated atomic objects (which is also true of Kernel operatives, and has nothing to do with static-versus-dynamic); or a statement that the standard does not require procedure objects to be evaluable (which would imply that they aren't first-class, and would honestly surprise me); or some other statement that I'm failing to imagine as a possibility.

### I meant the former

Procedures as encapsulated objects were introduced into the Lisp tradition by Scheme. Before that, and right up to CLtL1 and Elisp, lists of the appropriate form are procedures, though perhaps not the only type of procedure. You could cons up a list whose car was lambda and pass it directly to apply. That's not possible (portably) in CLtL2 or ANSI Common Lisp, and has never been possible (portably) in Scheme. All code objects are knowable in advance (hence static), and the only thing that can be freely created at runtime is a closure with dynamically determined values of the closed-over variables and a known code object.

The exception, of course, is a procedure created by eval. But eval did not exist in Scheme until R5RS, evaluates its first argument in an empty lexical scope, and there is no guarantee that the global scope in which the call to eval is executed is reified in such a way that it can be passed to eval at all. In principle, eval could invoke an entirely separate Scheme implementation that shares nothing with the calling program.

### Procedures as encapsulated

Procedures as encapsulated objects were introduced into the Lisp tradition by Scheme.

The Art of the Interpreter views the pre-Scheme strategy as lacking first-class procedures and as supporting only first-class representations of procedures. These are of course just different ways of saying the same thing, but I think I agree with Steele and Sussman that it's more useful to think of the earlier approach as not giving first-class status to procedure values. Procedures in this sense cannot be represented in source code — they are purely runtime entities, which is also why they are exempt from Wand's "theory of fexprs is trivial" effect (that is, Wand means by "theory" the theory of source expressions, and Lisp has a trivial theory of source expressions because all Lisp source expressions are passive data; procedures, being not source expressions, do not appear at all in Wand's trivial theory). Encapsulation of procedures is important for proving things about programs, but this is true regardless of whether the proofs apply to compile-time or runtime.

R5RS provides what I'd call a fake eval, since what makes eval a meaningful operation is first-class environments.

### "Fake" is a rather harsh term

See my comment on conlanging and fossil faking to your most recent blog post. "Disjoint" might be less contentious.

### It's surprisingly useless

compared with, say, an eval in the current environment, if safer.

### Current environment

Mm. Evaluating in the current environment can be dangerous too if it's easy to overlook the dependency, which is why Kernel's eval always requires an explicitly specified environment argument. ("Dangerous things should be difficult to do by accident." :-) Personally I think eval really only comes into its own when it synergizes with lexical procedures capable of optionally capturing their dynamic environments.

### word choice

Yes, "fake" is a rather harsh word choice. But I think it applies: not merely "not real" (similarly to the way conlangs are not real), but not real while misleadingly claiming to be real. It doesn't require deliberate deception, although to produce such a thing without deliberate deception would seem to involve some failure to recognize the lack of efficacy in what was adopted. Possibly the puzzling outwardly visible result of some compromise with a convoluted internal story.

### re fake eval ("word choice")

See the other comment for some thoughts on steering Scheme.

Fake eval arises from a general pattern of shunning any dynamic, reflective features that are simple and easy in an on-the-fly graph-based interpreter, but that can be used in ways that thwart optimizing compilation.

In the early history of Scheme it was widely recognized that Scheme made sense and had a potentially bright future in both dynamic and static forms, and that the two forms could be harmonized. (Aubrey Jaffer's SCM and his original CHEAPY compiler (later replaced by Hobbit) sketch and provide evidence of this understanding.)

Around the time of R4RS and then increasingly after, the authorities in control of the reports became hostile to the dynamic path. Leadership was dominated by the opinions of a few developers of academic research compilers.

### Obsolete dichotomy

Now that we have just-in-time and even tracing compilers (including in some Scheme implementations, like Racket), the research should be into combining the two.

Racket still shows problems from that fight though.

### re obsolete dichotomy

Now that we have just-in-time and even tracing compilers (including in some Scheme implementations, like Racket), the research should be into combining the two.

Uh... no, that's several confusions combined.

1. JIT was well known and convincingly practiced no later than 1987, published by 1990 (Self). R4RS arrived in 1991.

2. Really, JIT in the context of graph-based lisp interpreters was understood no later than 1990 (SCM).

3. The dynamic, reflective features under consideration here are interesting in no small part because they can be usefully applied in ways that defy even JIT compilation.

The attitude that an interpreter has to be justified by an imagined potential to use any technology, even JIT, to compete with optimizing compilers of static code -- that attitude -- is what helped the Scheme steerers drive away implementers like Jaffer (and me), and helped them lead the Scheme reports in the direction of being about the tastes of a very small number of compiler maintainers.

They treated Scheme standards as if they had to be protected like Cobol or Fortran standards; as if there were millions and millions of users more dependent on stability than on advance.

Like Jefferson Airplane said: "Soon, you'll achieve the stability you strive for / in the only way it's granted / in a place someone puts fossils of our time."

### re cheapy compiler

Then I am using the wrong name, but the thing I'm talking about was Jaffer's, not Steele's.

Jaffer had a little hack that would compile a tiny ad hoc subset of Scheme to simple-minded C to compile together with SCM. When Hobbit came along, it was a greatly cleaned up idea of the same concept.

I had the impression Jaffer's hack was for the purpose of some specific "day jobs" he was up to.

### "Fake" is exactly the right term

Without first-class mutable environments there is no way to use eval to accomplish anything that would justify its use in the first place. In pursuit of static optimizability, "eval" was effectively sacrificed; the only reason it remained in the language was to pay lip service to the tradition in which it had actually been useful.

It is well understood that 'eval' is hostile to static compilation, and that if you use it you will pay a heavy price. But that was not an adequate reason to blunt it into oblivion.

### Well Understood

It is well understood that 'eval' is hostile to static compilation

Right. That's the problem I am running into. I am trying to make, not a Lisp, but a compiler for an untyped combinator language (with quotation) where most of the compiler internals are exposed to the language, such that it should be feasible to implement DSLs more easily.

But yeah. The problem sucks, I am not sure where to progress, and I am not sure I would end up with something worthwhile.

With Lisp, most of the low-hanging fruit regarding dynamic compilation seems to be eaten.

### Module Loading

Isn't the problem with "eval" equivalent to module loading? You could compile the code in the eval to a shared object with unknown symbol bindings, and then let the dynamic linker resolve the environment. You would effectively build the code inside the eval as a .so or DLL and then pass the environment as a native symbol table to the dynamic library loader. That's probably as good as you can get statically, and would allow all static optimisations except inlining, which based on the fact you want to dynamically change symbol bindings is the best you could ever get statically.

### Dunno

I am not exactly sure what you are proposing but it doesn't feel like the right direction.

I have dumbed it down to two competing requirements:

1. I want to be able to write arbitrary functions over the AST resulting in new ASTs directly in the language. (If I could get rid of quotation that would be even better, I guess.)

2. It should be efficient. (And it should work in a REPL, of course.)

So, something like: it should be able to pick up an expression, compile it, run it, and then continue with the next expression?

### Static or JIT?

Isn't the aim to statically compile what is inside the eval, deferring calls mapped to the environment (and values by reference so they are mutable) until runtime? This is exactly what the dynamic link loader does when loading shared objects. It sounds like this is a valid solution to me, is it just that you don't like it, or want to do something more just-in-time?

The thing is I don't see any difficulty with JIT compiling the eval contents? You might want an intermediate byte code representation though, so you are not string parsing at runtime, but this seems like normal JIT compiling, so I thought you must want something else?

### I thought the problem was

what's in the eval might be

```scheme
(set! TopLevelFunctionThatsCalledEverywhere
      (lambda () (print "oh my god I changed the program")))
```

Either forcing the compiler to never inline functions or forcing it to pause and deoptimize running code (like Self used to).

### Don't do that :-)

As I commented elsewhere I don't think changing bindings for functions that are currently on the call stack is a good idea. What I wanted to allow was changing the bindings for the code in the eval, so you could call code with a different environment.

If you can pull the rug from code already in the call stack you would have to JIT compile, and set watches on symbols used in the call stack to invalidate the compile cache for that function, or indirect every call through a function table, still that would be no worse than virtual functions for performance.

### You misunderstood, I think

The problem isn't TopLevelFunctionThatsCalledEverywhere is currently running. No, not at all.

The problem is that other functions might have inlined TopLevelFunctionThatsCalledEverywhere, and that inlining is now an invalid optimization. Those functions all have to be recompiled because their optimization is wrong:
1) And some of THOSE functions might be running.

2) Worse than anything, imagine a loop that calls TopLevelFunctionThatsCalledEverywhere
a) and TopLevelFunctionThatsCalledEverywhere is inlined
b) and in this case it IS running
c) there is no call stack frame for it
d) you have to deoptimize the running program, CHANGE the call stack to simulate the function being on the call stack, deal with the impossible problem of how that function folded into outer code (I bet Self simply disabled most optimizations in the first place), and set it up so that the NEXT iteration gets the new function.

Think of how hard that is.

### I don't see it's that hard.

A compiler is merely an optimisation, so "all" you need to do is preserve the behaviour you'd get if it was interpreted.

All you're changing is a binding; anything that already *has* that binding gets to keep it "as is", anything that uses the binding after it's been rebound gets the new version. Sure, dealing with inlined code is a bit tricksy, but I can think of a few ways of doing that which should only affect performance in the case where you *have* rebound stuff, and anyway, optimisation is hard.

The only hard stuff is if you actually *want* to rebind everything up the call stack. Perhaps I lack imagination, but I can't actually think of any case where you'd really want to do that.

### That's because Keean focused on Symbol Bindings

The problem I am looking at is more general. One of the problems I encountered, for instance, is: If you quote something with the explicit intention of doing a source-to-source translation and compile it then how often are you going to compile? Once, for every compile instruction, or every time you encounter it in the body of an expression? Moreover, do you want to quote the source code or a runtime expression?

If you go the way of code-is-data there are lots of design decisions you can make beyond adopting the manner in which Lisp does it.

### Yep.

I saw this from time to time in scheme code. I think it came up most egregiously, in fact, in most of the early "records" implementations.

Somebody would want a user-definable type such as records, and they would implement it in terms of vectors, and then they would redefine vector-ref and vector-set! so that those functions did not work on their brand-new records.

And then of course that code would have a big ol' fight with someone else's code that had had the same idea, or with someone else's code whose continuations didn't catch at the point they were relying on having caught, or with someone else's winding continuations that were trying to protect references to particular other vectors with guard clauses, or ....

And that was about the point at which my hobby lisp started acquiring sigils that limited certain kinds of dynamism.

### JIT

For one, I want to be able to have access to all ASTs. So, while compiling a quoted expression, where a symbol refers to a defined object, I want to be able to substitute its definition.

A bit uncommon but should allow one to build, for instance, computer algebra systems. (Although Mathematica would arguably be better at that.)

I am not sure about it. I'm weighing the pros and cons of various approaches.

### Code as data

That doesn't seem that odd, although different from what I was thinking. You just need a data structure that can represent code. This would not be a byte code but, as you say, an AST, with a binary in-memory representation. It doesn't seem that different from an ordinary algebraic datatype. You would have a front-end parser that converts everything to the AST format, then you could self-modify the code. You would then JIT compile, and invalidate the JIT cache if a function's source AST is modified.

### Hard Problem

Well, it turns out to be a hard problem (to get right).

### Defining a real eval in terms of that

R5RS provides what I'd call a fake eval, since what makes eval a meaningful operation is first-class environments.

As long as the first-class environments can be iterated over, those are interdefinable. That said, I see R5RS and Kernel don't allow environments to be iterated over, so that might be a moot point.

I'll describe what I'm talking about anyway. Suppose our environments were association lists. We would like to write this:

```scheme
(eval-in-env '(/ (+ b c) a)
             (list (cons '/ /)
                   (cons '+ +)
                   (cons 'a 1)
                   (cons 'b 2)
                   (cons 'c 3)))
```


We can achieve the same thing with this:

```scheme
((eval-in-empty-env
  '(lambda (/ + a b c)
     (/ (+ b c) a)))
 / + 1 2 3)
```


We can define eval-in-env as a procedure that does this.
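A sketch of that procedure in R7RS terms; eval and environment come from (scheme eval), and the name eval-in-env is from this thread, not any standard:

```scheme
(import (scheme base) (scheme eval))

;; bindings is an association list of (name . value) pairs; the trick is
;; to turn the bindings into lambda parameters and apply the result.
(define (eval-in-env expr bindings)
  (apply (eval `(lambda ,(map car bindings) ,expr)
               (environment '(scheme base)))
         (map cdr bindings)))

(eval-in-env '(/ (+ b c) a)
             (list (cons '/ /) (cons '+ +)
                   (cons 'a 1) (cons 'b 2) (cons 'c 3)))
;; => 5
```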

If we don't like having the ability to iterate over environments (and I don't, necessarily), then I think we can proceed to write a library where these assoc lists are wrapped up in an encapsulated data type.

### Almost works in R7RS-small

Your first version can easily be written in R5RS or earlier, and in fact I intend to propose something related to it for R7RS-large in order to do partial evaluation, which is useful in connection with the possible introduction of second-class lexical syntax. More on that another day.

R5RS and R7RS-small do provide global environment objects. The trouble is that there is no way to create fresh objects. The only mutable environment object guaranteed to exist is the interaction environment, and there is no assurance about exactly what names it provides (informally: the names visible at the REPL). What is more, R5RS does not allow evaluating definitions in order to create new bindings, though R7RS does. I have a preliminary proposal to lift those restrictions for R7RS-large.

There is not, nor do I intend to propose, any machinery for first-class lexical environments.

### Opinions vary on some of those points.

I don't have a problem with multiple return values. In fact if you have both continuations, and functions that take multiple arguments, then they make the semantics more regular and consistent. I'd have expressed them differently but that's bikeshedding. I don't have a problem with them.

And lots of people think Scheme/Lisp macrology is natural and consistent; I think they are wrong, because these features create a staged runtime and its attendant set of complications, and the purposes they serve should probably have been served otherwise (as in Kernel).

Winding continuations on the other hand are in fact an ugly stain. They are a 'bandaid' on a very deep impedance mismatch that needed to be eliminated or resolved in a way that simplified things, rather than bandaged in a way that complicated them.

### In terms of convenience

and getting rid of unnecessary glue boilerplate, having multiple values the way Lua does, where both calls and returns can pass different numbers of values in the same function and a mismatch is not an error, is much more convenient. It may be less safe, but getting rid of boilerplate is worth it.

### Scheme allows that

The Lua / Common Lisp rules (extra values are dropped, missing values default to nil) are valid in Scheme, though not required, and a round dozen of Schemes actually apply them. However, not requiring them means that low-rent but conforming Schemes like Chibi can use very simple implementations, because there is no requirement to signal an error on a mismatch. Details on multiple values are available.
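For concreteness, here is how R7RS-small expresses multiple values, and where the report leaves the mismatch behavior open (a sketch; the "lenient" result shown is what the Lua/CL-style Schemes give):

```scheme
;; Producer and consumer arities match, so every Scheme agrees:
(call-with-values
  (lambda () (values 1 2 3))
  (lambda (a b c) (+ a b c)))     ; => 6

;; On a mismatch the standard does not require an error: a Scheme
;; with the Lua/CL-style rules silently drops the extra 3, while a
;; stricter implementation may signal an error instead.
(call-with-values
  (lambda () (values 1 2 3))
  (lambda (a b) (list a b)))      ; (1 2) under the lenient rules
```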

### Lua

I've observed the potential neatness of that facet of Lua, and considered its implications for language design in general. It seems to me that while a language design may benefit from neat conceptual devices like this, the benefit is lost if there are too many of them not fundamentally connected to each other. JavaScript has lots of pieces with insufficient overall coherence, making it a tangled mess. Indeed, getting the whole small enough for serious synergy is quite difficult. I've obviously thought deeply about what the core concepts of Lisp are, and it seems to me multiple-value returns don't fit cleanly into the picture. What could be done to make a coherent Lua-like approach work is a separate (and engrossing) question; indeed, one could say the same for any language with some merit to it. A side project I've been mulling over for many years is to try to isolate and expand the elegant core of vintage 1970s BASIC; it has to have had such a core, or it would never have been popular.

### In fact if you have both

In fact if you have both continuations, and functions that take multiple arguments, then [multiple return values] make the semantics more regular and consistent.

I submit this is an illusion caused by having got things subtly wrong in the first place. The irregularity is already present, and multiple return values try to generalize from it with inevitably unfortunate results; trying to build a higher and higher tower on a flawed foundation eventually comes to grief, hence the Smoothness Principle I've proposed re abstraction theory (any roughness in a language design ultimately bounds its radius of abstraction). Scheme doesn't allow you to write (apply (lambda x x) 2) even though common sense says this ought to evaluate to 2, because the language design fails to grant first-class status to the argument list passed to a procedure. Once you admit first-class argument lists, you can see that this entire structure is what ought to be passed to a continuation, so that instead of writing (c 2) one ought to write (apply c 2). The idea that there is something "more general" about passing multiple return values to a procedure is grounded in the illusion created by instantiating a continuation as a procedure that takes just one argument and discards the rest of its (second-class) argument list.
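For reference, the behavior the comment objects to can be seen directly; the error message wording below varies by implementation:

```scheme
;; apply requires its final argument to be a list, so the argument
;; list is assembled by the language rather than passed around as a
;; first-class value in its own right.
(apply (lambda x x) '(2))   ; => (2)
(apply (lambda x x) 2)      ; error: apply expects a list as its last argument

;; Under first-class argument lists, the comment argues, a continuation
;; c would be invoked as (apply c 2) rather than (c 2).
```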

### Obvious?

It is not obvious to me that something which accepts a proper list as its argument structure should also accept improper lists, nor is it obvious what the utility of function calls of the form (f a . b) might be.

The goal is to provide future programmers with a uniform language design that empowers them to imagine things we haven't thought of, by leveraging the uniformity we've given them, and, having thought of those things, to write them and have them work as expected. We expect to be unable to name specifically the ways such features might be used down the road. The fact that (apply (lambda x x) 2) in Scheme doesn't do what we expect it to do is a problem in itself because it deviates from our understanding of the uniform language design; the unacceptability of the deviation does not depend on our ability to name a specific situation where we want that. (I do recall using an improper operand-list in my library implementation of $cond, but the general point stands without that.)

### Yes, I think that was in fact a mistake.

But it's a small one. In building a Lisp of my own I decided it would be better if cyclic or improper lists were among the parts of the data representation that do not eval to runnable code. Their use in lambda lists seems like a minor wart to me.

I don't know if eliminating the improper-list lambda arguments would be a good choice for Scheme; I certainly wouldn't be bringing it up in a Scheme standardization effort, because although I like the &listargs sigil better, the issue isn't all that important and most people think of the . as a sigil rather than list structure anyway. The fact that it's list structure created inconsistencies when I defined 'lambda' as a procedure rather than as syntax. And rather than allowing the inconsistencies to spread and start affecting other things, I used a sigil list element instead. I'm also using other sigils (like &lazy) in lambda lists, so it seemed to be the most consistent approach from the user's POV as well.

Scheme has committed to the improper-list formal argument structure for a very long time and, unlike a lot of more important things, it does not seem to be causing a lot of wailing and gnashing of teeth. So I'd pass it by unless people were reporting it as a source of pain or limitation.

### Argument case is correct; conclusion does not follow.

You are absolutely right when you say it is a failure of 'smoothness' when Scheme does not allow something like applying a lambda expression directly. But that happens because of the function/syntax roughness, not because of any arity roughness, so I don't see how it's an argument against having the same rules for return arity and argument arity.

In terms of arity, I see smoothness in a lambda calculus where a function can only take a single argument and then return a single value. It's even smooth when the single argument is a list of values and the single return is also a list of values. Where the implementation is consistent about boxing and unboxing (i.e., never passes or returns a non-list), the multi-argument case is semantically identical to the single-argument case. But passing a list of arguments one way and not passing a list of returns the other way breaks symmetry, especially where the language has continuations.

For perfectly smooth semantics, functions ought to return the same way they're called: the caller should not need to know whether this is a continuation or some other kind of function, should not have to check for and generate special-case code for that case, and should not need to know whether its own call stack is going to get garbage-collected after making the call. It just does what it does to call a function, and that's it.

If you want multiple arguments and single returns, that's smooth in the *absence* of continuations. And continuations are a bit of a semantic minefield in the first place, and a simple call stack with tail recursion is very general. So smoothness achieved without continuations would not make a language at all crippled.

### I didn't give a complete

I didn't give a complete argument against multiple-value returns; I touched lightly on a few parts of it.
Continuations come into it because the way continuations are applied in Scheme encourages the misconception that leads to multiple-value returns; but even without continuations the misconception would still be possible. The basic notion behind multiple-value returns is that returning multiple values would be "more general than" returning just one. But if returning multiple values is more general than returning just one, wouldn't returning a first-class list of all the return values be "even more general than" multiple-value return? Of course it would, but doing so would seem silly, because in that case why not just return the whole list as a single result. And indeed, it would be silly, and one should just return the whole list as a single result.

It should be clear that this is all just a question of how you want to arrange the syntax for returning multiple values, and that's a point on which Scheme is weak. Indeed, I seem to recall seeing the same weakness in Backus's "Can Programming Be Liberated from the von Neumann Style?": although in theory a computation that produces multiple results can be handled by a functional expression that evaluates to a tuple, in practice that only works if you have syntax for very conveniently taking the tuple apart once it's been returned.

Kernel does have such syntax: generalized definiends. In Kernel, when you have a procedure p that returns a list of four values, you can write

    ($define! (a b c d) (p ...))


and those four values will be bound to symbols a, b, c, d. Scheme can't do that because it's overloaded its define syntax with some unfortunate "syntactic sugar" for defining procedures. (From some things I've heard about introductory Scheme classes, that syntactic sugar also tends to sabotage students' understanding of the elegant concepts behind Lisp, by encouraging them to think of procedures as second-class language elements.)

So what we're really dealing with is a very deep design choice: it might seem to be a "simple" question about the syntax of function value return, but it really has sweeping consequences across the whole design, which makes it difficult to give a compact explanation of the choice.

### Okay, so for backward compatibility ...

... with the admittedly lame syntax sugar for defining procedures, we have to name our general multiple-value operators somewhat noisily: define-values, let-values, and let*-values. They've been around for 15 years and are part of R7RS-small. That seems a minor thing to make a fuss about.
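Concretely, the operators named above give Scheme the same destructuring the Kernel example shows, just under noisier names (a sketch; `p` is a stand-in producer):

```scheme
;; The R7RS-small forms, exercised on a four-value producer.
(define (p) (values 1 2 3 4))

(define-values (a b c d) (p))   ; binds a=1 b=2 c=3 d=4, the analogue
                                ; of Kernel's ($define! (a b c d) (p))

;; let-values formals follow lambda formals, so a rest pattern works:
(let-values (((x y . rest) (p)))
  (list x y rest))              ; => (1 2 (3 4))
```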

### Uniformity looks like a

Uniformity looks like a small thing if you look for one small place where it'll make a huge difference. But it's ubiquitous, and has these individually seemingly-not-earthshaking effects that add up until, in the long run, its cumulative effect overwhelms everything else. Even then one might fail to notice it, because the effect is then on such a big scale, like failing to notice that you're standing on a continent.

### I feel like you're overstating it.

If I were to complain about Scheme I'd pick the lack of convenience:
1) Arrays don't grow for you. Why? WTF why?
2) Records have an interface so bad it looks like it's from the '50s.
3) Basic syntactic sugar goes through things so crazy, like make-set!-transformer or make-variable-transformer, that I want to run screaming.
4) The new macro system is horrible for everything except a few simple examples from someone's paper. Forget being able to implement an object system with instance variables visible in methods.
5) Despite protestations to the contrary, syntactic tokens have hidden fields that you can't examine or set properly, such as the all-important one that specifies what context a variable is in. At best you can just pick a token at random from the source and make a new token based on it and hope it's at the right level... Damn, it's so bad.

I mean, you're right about uniformity, but your example is a bad one.

### Lack of convenience

1) Arrays don't grow for you for efficiency, and because you can easily layer growing arrays over non-growing ones.

2) In the 1950s records looked like this:

    01  MAILING-RECORD.
        05  COMPANY-NAME            PIC X(30).
        05  CONTACTS.
            10  PRESIDENT.
                15  LAST-NAME       PIC X(15).
                15  FIRST-NAME      PIC X(8).
            10  VP-MARKETING.
                15  LAST-NAME       PIC X(15).
                15  FIRST-NAME      PIC X(8).
            10  ALTERNATE-CONTACT.
                15  TITLE           PIC X(10).
                15  LAST-NAME       PIC X(15).
                15  FIRST-NAME      PIC X(8).
        05  ADDRESS                 PIC X(15).
        05  CITY                    PIC X(15).
        05  STATE                   PIC XX.
        05  ZIP                     PIC 9(5).


Not Scheme, I assure you.

3) That's Racket-specific.

4) Identifier-syntax macros can and do handle "instance variables visible in methods". I personally think that when you see a variable, a variable it should be and not a disguised method call, but the capability exists in R6RS.

5) That's syntax-case-specific. Despite R6RS, syntax-case is not and never has been the only low-level macro system.
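The claim in point (1), that growing arrays layer easily over fixed-size vectors, can be sketched in a few lines; `make-gvec`, `gvec-push!`, and friends are hypothetical names, not from any SRFI:

```scheme
;; A growable vector as (count . storage), doubling capacity when full.
(define (make-gvec) (cons 0 (make-vector 4)))

(define (gvec-push! g x)
  (let ((n (car g)))
    (when (= n (vector-length (cdr g)))   ; storage full: double it
      (let ((w (make-vector (* 2 n))))
        (do ((i 0 (+ i 1)))
            ((= i n))
          (vector-set! w i (vector-ref (cdr g) i)))
        (set-cdr! g w)))
    (vector-set! (cdr g) n x)
    (set-car! g (+ n 1))))

(define (gvec-ref g i) (vector-ref (cdr g) i))
(define (gvec-length g) (car g))
```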

### Yup.

It is exactly as you say. A single argument - which is a list. And a single return - which is also a list. Makes it exactly equal to the one-argument lambda calculus. Neither more nor less general. I wasn't arguing about generality, I was arguing about call/return semantics mismatch. Smoothness fails at the semantic level if there is a mismatch where one is *always* a list and the other might not be.

The syntactic failure here is only that Scheme doesn't make it as easy to accept multiple returns as it is to pass multiple arguments. For me that comes under "yes the syntax is a shame, but no the semantics aren't wrong."

### some thoughts on steering scheme

I suggest asking those who did it why they pursued IEEE standardization in the first place. If they are not teetotalers, perhaps get a few drinks into them first. Ask: Are those reasons relevant today? Were the broader goals of IEEE standardization actually achieved? In retrospect, does it seem worth it (a) personally, (b) for the future development of research, development, and use of scheme? Cui bono? Quid?

Scheme would be a different and, I think, more interesting language these days if Steele's advisors had looked at the Rabbit thesis and said "It's a bit thin. Add a second half about dynamically interpreted Scheme and reflective features made possible in an interpreter."

Instead, the authorities since R4RS have seemed hell-bent on making sure that any such dynamic features are Not Scheme. At the same time, they keep raising the bar for what a complete implementation is supposed to comprise.

While that's what the authorities are up to, no small part of (what remains of) actual real world interest in Scheme seems to concern flyweight implementations and the use of Scheme for dynamically programming interactive environments. Go figure.

### One reason for pursuing

One reason for pursuing standardisation for Scheme (and for Common Lisp) back then was a fear that otherwise someone else would initiate a standardisation process and would consequently have more control over the resulting language definition.

Both Scheme and Common Lisp had an informal group working on what the language should be, and standardisation in effect took those groups and turned them into the technical committees for the standardisation efforts.

The danger wasn't only from potential standards for the very same language (Scheme or Common Lisp) but also from ones that would seem to cover those languages, so that it wouldn't make sense for them to have a separate standard. One of the reasons the ISO standard is for ISLisp, rather than Lisp, is that the Americans were adamant that there mustn't be a standard for all of Lisp -- indeed, McCarthy said he would denounce any such standard -- and that Lisp be treated as a family of languages which could have separate standards, rather than as one language.

Another reason for pursuing standardisation, at least for Common Lisp, was that some funding bodies preferred or required the use of applicable standards. That put languages that were not standardized (or, rather, people who wanted to use those languages) at a disadvantage.

### re: one reason

Another reason for pursuing standardisation, at least for Common Lisp, was that some funding bodies preferred or required the use of applicable standards. That put languages that were not standardized (or, rather, people who wanted to use those languages) at a disadvantage.

Sure, Common Lisp was clearly an attempt to draw the boundary lines between commodity forms to prevent lock-in to particular lisp vendors. It could be compared by analogy to the way POSIX was meant to resist lock-in to particular unix vendors.

The thing about Scheme is that it never actually encountered the threat of lock-in, because no implementation of Scheme at any point in its entire history has ever had quite the economic importance of any of the commercial Lisp machines (never mind a Unix). There's something cargo-cultish about standardizing it.

McCarthy said he would denounce any such standard -- and that Lisp be treated as a family of languages which could have separate standards, rather than as one language.

I had no idea. Smart guy.

### Well....

If the reason the last RnRS failed to adapt to the new realities of unicode was a desire to be consistent with the IEEE standard, then it is best to allow the IEEE standard to die, so that the mistake is not repeated.

However, I fear that having already standardized on the Wrong Thing, they will not correct their errors in that regard. Nor will they make a choice between flow-of-control paradigms; now that throw/catch and exceptions have been forced into the language, they are going to fight unto Scheme's death against continuations - particularly against winding continuations, which are a stain on the language in their own right.

### So...

if the RnRS branch has gone irretrievably wrong, would that then be a reason to ignore it and strive to get the IEEE branch right?

### Good luck with that

After doing all the work, you need to send out ballots to the electors, and get a 75% return on ballots and a 75% approval rate. "Politics is the art of the possible." (Peter Medawar)

### Figuring out what one wants

Figuring out what one wants to happen is prerequisite to any political effort, however imperfect, to make it happen. It's not possible even to approximate achievement of goals one hasn't identified.

### What do you mean, "die"?

Do you imagine that because the IEEE withdraws its official approval, the P1178 document (or R4RS+, whatever) ceases to exist? For every Schemer who wants Scheme to "adapt to the new realities of Unicode", there are at least two Schemers who think the ASCII repertoire (not even the ASCII encoding) is the perfect, jewel-like counterpart to small Scheme. Everything beyond ASCII is standardized but optional in R7RS, and that's what was needed in order to get to consensus. (The fact that ASCII, like Unicode, was a political compromise is simply forgotten.)

With a less conservative Steering Committee, I would have been happy to make R7RS a more radical break with the pre-Unicode past. But the community elected that Committee, and presumably got what it expected to get. There is no "gray They" here, just John Cowan and Alex Shinn and Art Gleckler and all the WG1 members and all the people who voted either for or against R7RS.

Ah well, "the part-time help of wits is no better than the full-time help of halfwits." (Wolcott Gibbs, I think)