Scheme to be split into two languages

According to a draft statement, Scheme is to be split into a small and a large language, the small being designed for educators and "50-page purists", the large for "programmers, implementors".

The small language should remain true to the language design precepts found in the RnRS introduction ("Programming languages should be designed not by piling feature on top of feature, …").

But what about the large one? Will it drop continuations, become a Lisp-2, and challenge Common Lisp?

It's a brave attempt

I wish the committee all the best in this endeavour. I think it is a very brave attempt to simultaneously try and keep the Scheme community happy (not that this is possible) and drag standard Scheme towards being a useful programming language. (Many implementations of Scheme are useful; standard Scheme is not.)

Can't see it becoming Common Lisp though.

R6RS is the Current Standard

All Schemes were "useful" for something. But I suppose you mean useful for about the same class of tasks as Common Lisp. Just a reminder: The current RnRS standard, R6RS, is quite large, too large for some, and both useful and practical for a large class of programming tasks. What features/libraries are, in your opinion, missing from R6RS Scheme that would be needed for the Scheme standard to be useful?

Most of the SRFIs

Most of the SRFIs

Standard libraries?

Would someone who knows more about what's involved in the split mind commenting on why this can't be done with a simple core and a number of standard libraries on top of that core?

re standard libraries

I don't speak for the committee but I believe I can answer. The "large language", should it emerge, will undoubtedly be a small core with libraries. The small language will likely be like the small core, but with weaker semantics. For example, the core of the large language might reasonably require the character type to include, at least, Unicode. The small language should not preclude support for all of Unicode; neither should it require it.

For example, suppose that I want to (say, in a textbook, or just as a code snippet to hand around) show a canonical implementation of quicksort that takes a comparison predicate as a parameter. I can do that in "small Scheme" and my code works in both languages.
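To make the quicksort example concrete, here is a rough sketch of what such a snippet might look like. It sticks to R5RS-level procedures, on the assumption that anything called "small Scheme" would keep at least those; the names quicksort and partition-by are mine, not from any standard or draft.

```scheme
;; Sketch only: list quicksort parameterized by a comparison predicate.
;; `quicksort` and `partition-by` are illustrative names, not standard ones.

(define (partition-by keep? lst)
  ;; Returns a pair: (elements satisfying keep? . elements that do not),
  ;; each in their original order.
  (let loop ((rest lst) (yes '()) (no '()))
    (cond ((null? rest) (cons (reverse yes) (reverse no)))
          ((keep? (car rest)) (loop (cdr rest) (cons (car rest) yes) no))
          (else (loop (cdr rest) yes (cons (car rest) no))))))

(define (quicksort less? lst)
  ;; Lists of length 0 or 1 are already sorted.
  (if (or (null? lst) (null? (cdr lst)))
      lst
      (let* ((pivot (car lst))
             (parts (partition-by (lambda (x) (less? x pivot)) (cdr lst))))
        (append (quicksort less? (car parts))
                (list pivot)
                (quicksort less? (cdr parts))))))

(quicksort < '(3 1 2))                 ; => (1 2 3)
(quicksort string<? '("pear" "apple")) ; => ("apple" "pear")
```

The sort itself never looks inside the elements; everything character-set-specific lives in the predicate that the caller passes in, which is why the same code could run unchanged in both languages.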

On the other hand, suppose I want to show a canonical comparison predicate that captures Unicode collation rules. That code will (if these languages actually happen) be portable among large Scheme implementations but not necessarily among small Scheme implementations.

-t

It is

I don't speak for the group, but the old R6RS-discuss conversations, and the charter, seem to read that way.

Working Group 2

The hardest part will be trying to find a name for their new language. I wish them luck.
I hear Scheme is taken, so is Python. Maybe they should standardize Python instead, and leave the rest of us in peace. It was always my opinion, that if you wanted, you could easily write a scheme interpreter, and go off and do whatever you like. I don't see a lot changing here, other than we will have yet another section in the O'Reilly catalog to fill the book shelves, all graced with titles like "Language Name TBD: Cookbook".
Working Group 2, I wish you the best of luck with your new language!

Trying to find a name ...

Given the scheme community's penchant for tossing R's around (RnRS), schreme (pronounced "scream") is one obvious choice. The only problem is, which language gets that new name, and which one gets to keep "scheme"? (Or "schism" as Ray suggested, but ... same problem.)

Not Two Separate Languages

The proposal in the *draft* statement is NOT to develop two separate Scheme languages, but a small and large version of the same language, where the small version is a strict subset of the large one. It is not quite clear what is meant by "subset", but I take this to mean that any valid program in the small language is also a valid program in the large language.

I agree with Matt that it would be better to frame this as a distinction between a small core language and a large set of standard libraries, rather than as two languages. As this discussion already shows, speaking of two languages suggests splitting the community, whereas the goal is surely to bring the community back together, after the supposed split caused by R6RS.

I thought it was very clear.

From Coordination with Working Group 1:

The programming languages developed by working groups 1 and 2 must be consistent, and the programming language developed by working group 1 should be a proper subset of the language developed by working group 2.

So far as is practical, this relationship between the small and large languages should be achieved by having the documents that describe the large language refer to the documents that describe the small language instead of duplicating those documents or portions of them. That in turn will require coordination between working groups 1 and 2.

This definitely points at a small language kernel and a larger language standard library (for lack of a better term; perhaps user space, to extend the OS metaphor?). The framing might be lacking, though.

No, it will not

Both charters say:

The programming languages developed by working groups 1 [small Scheme] and 2 [large Scheme] must be consistent, and the programming language developed by working group 1 should be a proper subset of the language developed by working group 2.

Speaking for myself only:

Roughly, small Scheme will be the successor of R5RS, whereas large Scheme will be the successor of R6RS. R6RS is defined in terms of a core and libraries, but the libraries are all mandatory, and most of them were invented by the R6RS standardization process. Many people (I speak as an R6RS supporter who mostly uses an R5RS implementation) thought R6RS was too rich, too featureful, broken in some or all of its features, and generally did not meet their needs.

Unfortunate Framing

It's unfortunate that the steering committee suggested that the small language is the appropriate one for things like education and research, since many so-called 'large' Scheme systems have more than proven their worth for education and research.

"In-looking-out" vs. "Out-looking-in" Education

Presumably they're referring to typical CS "introduction to programming languages" courses where Scheme is really nice because you can show a meta-circular interpreter in a very small amount of code. A "learning how (some) programming languages work" as opposed to a "learning how to program" course. Think SICP.
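For readers who haven't seen one, the following is a rough sketch of how small such an evaluator can be. It is not SICP's evaluator: it handles only numbers, booleans, variables, quote, if, single-expression lambda, and application, it represents environments as association lists, and it borrows primitives from the host Scheme. All names (tiny-eval, tiny-apply, lookup, base-env) are made up for illustration, and `error` is assumed to be available (R7RS or SRFI 23; it is not in R5RS).

```scheme
;; Sketch only: a tiny evaluator for a small subset of Scheme.
;; An environment is an association list of (name . value) pairs.

(define (lookup name env)
  (let ((binding (assq name env)))
    (if binding
        (cdr binding)
        (error "unbound variable" name))))

(define (tiny-eval expr env)
  (cond ((number? expr) expr)
        ((boolean? expr) expr)
        ((symbol? expr) (lookup expr env))
        ((eq? (car expr) 'quote) (cadr expr))
        ((eq? (car expr) 'if)
         (if (tiny-eval (cadr expr) env)
             (tiny-eval (caddr expr) env)
             (tiny-eval (cadddr expr) env)))
        ((eq? (car expr) 'lambda)
         ;; A closure records parameters, body, and defining environment.
         (list 'closure (cadr expr) (caddr expr) env))
        (else
         (tiny-apply (tiny-eval (car expr) env)
                     (map (lambda (e) (tiny-eval e env)) (cdr expr))))))

(define (tiny-apply proc args)
  (cond ((procedure? proc) (apply proc args)) ; primitive from the host
        ((and (pair? proc) (eq? (car proc) 'closure))
         ;; Bind parameters to arguments in front of the closure's env.
         (tiny-eval (caddr proc)
                    (append (map cons (cadr proc) args) (cadddr proc))))
        (else (error "not a procedure" proc))))

(define base-env
  (list (cons '+ +) (cons '* *) (cons '< <)))

(tiny-eval '((lambda (x) (* x x)) 7) base-env)                ; => 49
(tiny-eval '(if (< 1 2) (quote yes) (quote no)) base-env)     ; => yes
```

Lexical scope falls out of the closure carrying its defining environment, which is the kind of thing a student can see directly at this size.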

This remains one of my favourite lessons from undergrad, and Scheme is the perfect language for it because it's small. After/during a course like this the student can look at R5RS and truly say they understand the entire thing. The impact of the lesson is kind of lost if you implement a toy that isn't actually useful, but these courses are often the first exposure students get to axiomatic programming languages. That their "toy" interpreter could actually be used to make a fully powerful extension language (or whatever) is an important part of it.

In a world where CS undergrad education often starts with blindly copying Java boilerplate and "all the way down" understanding is rarely - if ever - part of the picture, I think this is an important thing to preserve.

I don't think it's really realistic to expect Scheme to be used in the pragmatic aspects of education anyway - people in the "real world" simply don't program things in Scheme very often.

Re: "all the way down"

While the metacircular interpreter approach definitely teaches a lot of important lessons, I'm not sure I see SICP as going "all the way down."

The later chapter of SICP that tries to cover low-level hardware design definitely came across as an overtly academic exercise when I was an undergrad. Maybe I would read it differently now, but it seems like implementing a machine simulator doesn't provide the same kind of experience as coding in assembly.

But I digress. My original intention was to ask: might a SICP-style introductory CS course based on Forth be an even better way to achieve this "all the way down" goal?

Not quite all the way down

As far as I can tell, most of these courses don't actually go down to the instruction level like SICP. That's often a separate class based on real machine architectures, assembly, etc. (it is here, at least, and a quick googling suggests this isn't uncommon).

I suppose I was exaggerating with "all the way down". "All the way down to the language implementation" is what sets these kinds of courses apart from most undergraduate programming courses, which tend to be more along the lines of "here is the language, now do things with it".

You can only fit so much in a course, and it's often difficult to get the students' minds to click into functional thinking in that time, let alone throwing low-level assembly-like programming into the mix as well (I have TAed this course several times, many students vehemently resist changing their imperative ways... oh, the horrible things I've seen...)

More important than the 'all the way down', IMO, is reversing the brain damage caused by being initially taught in Java. At institutions that do not do this it's less of an issue I suppose; thankfully this awful practice is becoming less common. Adding a lot of (purely!) imperative content at the end of the course could erase a lot of the functional lessons...

Forth would be an interesting choice if you wanted to design a course to truly go "all the way down", but I think a lot would be lost because Forth is an even more 'strange' language, and you don't really see much Forth influence in popular languages. However, popular languages are getting more functional all the time, so having students wrap their heads around Scheme in a functional way is definitely beneficial.

As always, it depends on the institution. Such a course would be interesting if there was room for it; unfortunately language-based courses are often few and far between... often the single (undergraduate) one available is more of a token "things that aren't Java/C/C++" course than anything.

Hooray

Sometimes schism is the best thing that could possibly happen. This is one of those times.

Scheme (by which I mean the standard, up through R4RS and maybe R5RS) met the needs of language experimenters and embedded device makers in a very important way: by remaining silent on most of the topics into which there was ongoing research.

That meant that if you had a good idea about how unicode should be supported or about supporting multiple different character encodings simultaneously, or how threads or interrupt servicing or interprocess communication ought to work, or what GUI code ought to look like, you had the opportunity to build a Scheme system using your experimental parts and try it out and see how well it worked.

The standard was a sort of umbrella under which many vastly different experimenters and designers could coexist, having created languages compatible enough with one another to support a large class of useful programs. But that class of programs mostly had to avoid the items on which there was active experimentation.

It also meant that if you needed a scheme system that was tiny or peculiar, you could do it - adopting minimal interpretations of everything the standard left optional and not going beyond it gave you an opportunity to make a Scheme system standard compliant in a tiny memory space. Or, if you had a ternary system using some deeply strange character encoding, you could make a conforming scheme for that system.

But if you weren't a language experimenter, the situation was less happy. The minute you wanted to do something the standard was silent about, you found yourself depending on a single implementation. The class of programs whose semantics were guaranteed by the scheme standard could not use, for example, concurrency or graphical user interfaces.

The fight between the faction that wanted to keep experimenting and the faction that wanted an end to experimentation and a standard that would support a much larger (commercial) class of programs, including all the comforts of Python, Common Lisp, etc., came to a head in R6RS, and there were lots of rude things said and bad feelings on both sides.

At the time I feared schism. But after due reflection I think it is probably the best thing that could possibly happen to Scheme at this point. R6RS was basically dead on arrival, having failed to garner the support of the most respected people in the community. A schism is the only thing that will allow both camps to move forward.

Syntactic Sugar vs Linguistic Abstraction vs. Libraries

I think this division of labor could be the best thing to keep Scheme moving along while retaining its roots in simplicity.

For various reasons, I'm reminded of CTM's take on kernel languages

  • First, define a very simple language, called the kernel language. This language should be easy to reason in and be faithful to the space and time efficiency of the implementation. The kernel language and the data structures it manipulates together form the kernel computation model.
  • Second, define a translation scheme from the full programming language to the kernel language. Each grammatical construct in the full language is translated into the kernel language. The translation should be as simple as possible. There are two kinds of translation, namely linguistic abstraction and syntactic sugar.

Having macros in Scheme means that the distinction between libraries, linguistic abstraction and syntactic sugar can be a bit more seamless than in most other PLs. However, as we found with R6RS, there are many things (such as modularity) that must be in the kernel. It will be interesting to see how that tension between simplicity and utility continues to play out in the Scheme community. I think all PL design is politics, and all politics is about compromise. The attempt to placate the hardliners with the core language and the compromisers with an extended set segregates the two camps. But, at some level, the hard work of unification still has to be done.

Good

I'd call them the "core" and "engineering" branches of scheme.

I'd also like to see the core scheme drop some pieces of engineering baggage that R4RS picked up from the IEEE standard, e.g., disjointness of primitive types. Making core scheme be something along the lines of:

  • R3RS
  • plus an extension of syntax-rules to allow first-class macros
  • plus a minimalist record system upwardly compatible with r6rs
would please me.

The semantics of engineering scheme should be given by a reference meta-circular implementation in core scheme plus library definitions. Cf. Clinger's ERR5RS proposal (Wayback machine).

I'd like the emphasis of core scheme to be "attractive for implementors", in the sense of being open to speculative and surprising new implementations. Jeffrey Siskind has made the case strongly that this is where the true strength of scheme lies. Certainly it is where the language itself came from.

Two questions about administration

Why were none of the current steering committee members also members of the r6rs steering committee?

What is the "strategy committee"?

about admin...

Up until the current steering committee, the steering committee was self-nominating, for historic reasons. The last self-nominated committee gave people a chance to self-nominate as members of the community (you had to write a few original words explaining your interest in Scheme) and that self-nominated community elected the current lot of jokers. As I recall (you can double-check, it's on record) none of the earlier s.c. tools stood for election in that process - those fools wanted to leave the game. The suckers that got elected stood for election and thus deserve whatever they get.

I mean, of course, that we lost some very fine people there and that even R6 was a heck of a noble effort (and did contribute much of value) and we got a surprisingly sane choice of replacements, all also very fine people. But it's fun to call them all tools and jokers in the way I'm doing here because (a) it's a reminder of mankind's general fallibility, (b) it's a good counterweight to their well deserved s.c. title in an egalitarian community, (c) it's ridiculous on its face and so not easily taken too seriously, (d) I'm superstitious and don't want to jinx them by singing high praises and invoking Loki to come in and show up the whole community, (e) you know they're going to screw up *something* so let's get the insults lined up in advance, and (f) it's pedagogical: Hey kids, such insults are rubbish and the utility of words like tool and joker is that they are very fine pronouns of a certain color when used in a friendly way. Actually, I think "sucker" is probably the best adjectival-pronoun in the current state of things: they stand a ghost of a chance. g- bless and g- speed to them. and to the Scheme community. In theory we're slower, more deliberative, and ultimately come up with better answers. In practice, we're observably slower and more deliberative.

re "strategy committee" - no idea.

I just wanted an excuse to remind each of us in the Scheme community, not least the steering committee members, that we're all jokers, tools, fools, and suckers. If we keep that in mind, R7 might come out really nice.

-t

Strategy Committee

The Strategy Committee was the committee which set up the current Scheme standardization process, starting in 2002. From the Scheme Standardization Charter:

The Strategy Committee was formed by attendees of the Scheme Workshop in Pittsburgh, October 2002. The draft charter and the committee-selection process were further confirmed by the attendees of the Scheme Workshop in Boston, November 2003.

Thanks

Yes, I forgot that. So the steering committee is a per revision oversight committee that is meant to ensure that the revision editors actually do what they are supposed to do, and to replace them when they fall in battle? And the strategy committee is to make sure that the goals of each revision make long-term sense?

Your r6rs-discuss message seems to support this.

If scheme is split then maybe rename larger variant

It may be a good idea to create a new scheme variant more suitable for larger projects and/or static code generation. However, IMHO it would be better to give the language/system a new name. How about calling it Contrive?

Small vs Large: Three Interpretations

Thus far I've heard three possible interpretations of the distinction between the small and large versions of Scheme:

1. Abstract/Portable. The small language specifies the defining characteristics of the Scheme family of languages (e.g. lexical scope, proper tail recursion, Lisp-1, programs as data, first-class continuations, hygienic macros, interactive REPL), abstracting away all details which are outside the scope of these defining characteristics (e.g. records, data types, particular procedures, i/o, character sets). The large language is a standard instance of the abstract small language, with additional features for writing practical and portable programs (e.g. modules, specific character set, a large standard library). Other implementations of the abstract language would be members of the Scheme family, but programs written in these languages may not be portable.

2. Minimal/Rich. The small language defines a concrete but minimal Scheme programming language, no larger than R5RS, perhaps smaller, which is a strict subset of the large, rich language, in the sense that every program written in the small language is portable to every implementation of the small or large language. The small language could, for example, have fewer data types and procedures. And perhaps no macros or modules.

3. Core/Libraries. The small language defines all of the features of Scheme required for implementing the large language, but nothing else. In a particular implementation of the large language, some data types and operations may be implemented in a more efficient way than is possible using the core language alone, but in principle the core language should be expressive enough to be a platform for the entire large language. Programs written in the core language are portable across all implementations.

Which of these interpretations do you prefer? Are there others? Do we have to choose among them? Why not define them all in the next standard? (Abstract, Minimal, Core, Standard Libraries) The minimal language could be smaller than the core language.

Might Small Scheme be an ocap language?

Browsing this page and the links from here has been one of the more uplifting and hopeful moments I've had contemplating the future of dynamic languages. (Especially by contrast to my despair over the R6RS bloat.)

I wonder whether the future Small Scheme might even be an object-capability language along the lines of Jonathan Rees' W7? I also just heard about WeScheme which seems relevant.

Merger?

I like this idea of splitting Scheme into a minimal educational/theoretical dialect (R3RS+W7+hygiene sounds nice!) and a "batteries included" dialect. Since the latter is so similar to Common Lisp, and CLTL3 is underway, I wonder if there's any possibility of merging the two languages. Wishful thinking, I know, but I had to say it :-)

A small language is enough

Almost no one uses both small and large languages, so splitting the language is just double work and helps no one. Language + library is better. There is little need to split the language part.

Language part should be something like: R3RS + bug fix + minimalist record system + minimalist module system + well designed macro system

I think a record system can be as simple as a vector plus a special field for type information.
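As a rough sketch of that idea (not a proposal for the actual record system), slot 0 of a vector can hold a type tag and the remaining slots the fields. The names make-record-type, make-record, record-of? and record-ref are hypothetical, and `error` is assumed to be available (R7RS or SRFI 23; it is not in R5RS).

```scheme
;; Sketch only: records as vectors whose slot 0 holds a type tag.

(define (make-record-type name)
  ;; A freshly allocated list, so each type is unique under eq?
  ;; even if two types share a name.
  (list name))

(define (make-record type . fields)
  (apply vector type fields))

(define (record-of? type obj)
  (and (vector? obj)
       (> (vector-length obj) 0)
       (eq? (vector-ref obj 0) type)))

(define (record-ref type obj index)
  ;; Field indices start at 0; slot 0 of the vector is the tag.
  (if (record-of? type obj)
      (vector-ref obj (+ index 1))
      (error "wrong record type" obj)))

(define point (make-record-type 'point))
(define p (make-record point 3 4))

(record-ref point p 0) ; => 3
(record-of? point p)   ; => #t
(vector? p)            ; => #t
```

The last line shows the trade-off: records built this way are not disjoint from vectors, so any code holding p can bypass the accessors with vector-ref or vector-set!.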

A module system provides namespaces and a way to import functions from other files.

Syntax-rules is simple, but too difficult to use and can fit into the rest of the language. Maybe syntax-case or some new macro system can fill the need.

I don't think Unicode is a problem if case sensitive, but interpreters/compilers should accept all characters the machine uses.

More powerful functions can be put in a "standard library", as the C language does. The library can be updated more frequently.

SICP-style teaching language

An SICP-style teaching language shouldn't have second-class objects, notably hygienic macros. (So-called "first-class macros" are, of course, just second-class macros plus a type system to take over responsibility for shaping the second-class restrictions imposed.) It's been my observation that when students in an SICP-style course have trouble with certain features of Scheme, it's because the novice student expects things to work in a simple uniform way whereas Scheme works in an ad hoc way, like the "and" and "or" operators being second-class. Fexprs eliminate that non-uniformity; macros exacerbate it. I've thought for years that the R5RS abrogated the right to claim the philosophy of eliminating-weaknesses-and-restrictions when it adopted hygienic macros into the language.
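A small sketch of the non-uniformity in question: + is a procedure and can be handed to apply, while and is syntax and cannot. The usual workaround is a procedure such as the hypothetical and-proc below, which returns the same value as and would but loses short-circuit evaluation, since all arguments are evaluated before the call.

```scheme
;; Sketch only: why `and` surprises a novice who expects uniformity.

(apply + '(1 2 3))          ; => 6, because + is an ordinary procedure

;; (apply and '(#t #f))     ; rejected: `and` is syntax, not a value

(define (and-proc . args)
  ;; Same result as `and` on already-evaluated arguments:
  ;; #f if any argument is false, otherwise the last argument
  ;; (or #t when there are no arguments).
  (let loop ((rest args) (result #t))
    (cond ((null? rest) result)
          ((car rest) (loop (cdr rest) (car rest)))
          (else #f))))

(apply and-proc '(1 2 3))   ; => 3
(apply and-proc '(#t #f))   ; => #f
```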