Clojure’s Mini-languages

A nice blog post about "little mini-languages" that form part of the syntax of Clojure.

Standard examples

These are all pretty much standard examples of language extensibility in most Lisp dialects.

They are not all results of

They are not all results of macro extension, but yes, in the post I tried to highlight your exact statement. Maybe it wasn't clear, but I agree with you.

question

Your post could have been named "Examples of language extensibility in Clojure"?

You are right, mini-language is broad.

Mini-language

Of course the term "mini-language" is a bit nebulous, but it's the best I could come up with. I hope the spirit of its meaning is relayed well in the post.

Having said that, at the bottom of the post I ask "What mini-languages does your language have?" As a student of programming languages, I am truly interested in the answer and would find immense joy in learning about similar features in other languages.

Mini-languages considered harmful?

I found both the focus of your article and the term "mini-language" to be interesting, actually, given that you're one of the better-known Clojure enthusiasts/hackers. My first thought upon seeing the headline was a comparison to C++, which seems to have gained a reputation for being composed of a number of "mini-languages", but in an almost universally negative way; many of the regular C++ horror stories revolve around the "we're using C++ but not X or Y" - everyone picking 20% of the language to use to keep the complexity manageable. I thought to myself, "oh no, is fogus giving up on Clojure?!?"...

Of course, that not being the case, I thought this was an interesting parallel: various pieces of the Clojure language "filling a particular sweet spot", vs (from JWZ) "When you’re programming C++ no one can ever agree on which ten percent of the language is safe to use."

It seems the differentiator is in the "tightness" of the "mini-languages", where each fits a specific and well-defined purpose.... But on the other hand, I've only been hacking Clojure for 6 months, and I avoid C++.

Extending fogus' question, then - "what makes a good 'mini-language'?", or perhaps, "in what sorts of languages would differentiation into 'mini-languages' be beneficial to the language?"

c++'s vs. clojure's mini-languages

i'm guessing that mini-languages go bad when they aren't sufficiently orthogonal?

20% of C++

Re: C++:

"many of the regular C++ horror stories revolve around the "we're using C++ but not X or Y" - everyone picking 20% of the language to use to keep the complexity manageable."

Tom Tromey's take:

I often hear stories of disappointment from gdb users. It doesn’t scale well. It doesn’t handle threads well. It only has a vague understanding of the C++ that users insist on typing at it. That was my experience, too, when I was working on C++ (or worse: Java) programs.

But gdb itself doesn’t use any of these features. It is written in more or less plain C. It is single-threaded. It is not too big. It does not rely on many shared libraries. So, as a gdb developer, I find it is pretty easy to forget that it has flaws.

I once heard about a C++ compiler written in C++ where, in order to counteract this same sort of problem, its developers mandated that the compiler use every existing C++ feature.

For fun try to picture code review on that project. “Bob, this patch looks ok, but I think you should use operator overloading and exceptions here.”

:-)

:-)

Simplicity

If there was ever an argument for keeping your language simple, I think this is it! ;-)

Scalable mini-languages

One nice example of a "mini-language" is Common Lisp's lambda lists (aka function signatures).

In a lambda list you have &optional, &rest, and &key to specify optional, rest, and keyword parameters.

So far, so good. However, CL extends this with other stuff: there's &allow-other-keys, &aux, &body, &environment, and &whole. And, of course, DSLs can reuse this convention to introduce even more differentiation.

So, this is a scalable mini-language for all the stuff we might want to pass to a function or macro. (If CL were less of a mudball and used "more syntax", like, say, Python, adding the extensions in the previous paragraph would be harder, so I consider CL's lambda lists a well-designed feature.)

P.S. Whether having such elaborate function signatures is a good thing is a different question. ;-)
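Since Python came up as the "more syntax" comparison, here is a rough sketch of how the lambda-list keywords above map onto Python's parameter kinds. The function draw and its parameter names are hypothetical, invented only for illustration, and the correspondence to &optional, &rest, &key and &allow-other-keys is approximate rather than exact.

```python
import inspect


def draw(shape, color="black", *points, width=1, **extra):
    """Hypothetical function, used only to illustrate parameter kinds.

    shape    -> required parameter
    color    -> roughly &optional (it has a default)
    *points  -> roughly &rest
    width    -> roughly &key (keyword-only, since it follows *points)
    **extra  -> roughly &key ... &allow-other-keys
    """
    return {
        "shape": shape,
        "color": color,
        "points": points,
        "width": width,
        "extra": extra,
    }


# Every kind of argument in one call.
result = draw("line", "red", (0, 0), (1, 1), width=2, dashed=True)

# The signature itself is inspectable data, much like a lambda list.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(draw).parameters.items()
}
```

Note that the markers here (the bare * and **) are fixed punctuation, whereas the & keywords in a lambda list are ordinary symbols, which may be part of why the CL convention is easier for DSLs to extend.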

CL's Lambda Lists and DSL's

Yes, CL's "complex" lambda lists (I don't have a better term for CL's flexibility with function arguments) are a small language-level DSL for defining functions and macros. They are also a great feature to have at hand when designing other DSLs within CL (or any other language with similarly flexible argument list specifications).

For many moons now, I've been muddling through CL's loverly complex lambda list features and trying to figure out how to jam these features into a strongly typed language. The work continues....

Mini languages my language has?

I'll give you a list, but I'm not sure how applicable it is to Clojure. Many of these mini-languages are a notational convenience only, because they provide a different lexical syntax.

- I'll start with simple things: I have a mini-language Calc that enables infix notation for arithmetic expressions: Calc[3+4*5] is the same thing as Add(3, Mul(4, 5)).
- A bit more useful is a shell-like language for embedding values in strings: "Here's a $x variable." is equivalent to Join("Here's a ", Join(x, " variable.")).
- Comments are lacking in my base language, but there is a Comment mini-language in which every string is a valid program with no effect whatsoever. So Calc[3+4*Comment[Here is a five:[5]!]] is equivalent to the first example above.
- In the same vein, there's a Literate language which is the same as Comment but may contain an expression of the host (or any other) language it comments on: Calc[3+4*Literate[Here comes a five!:[5]-1] for example.
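The first two translations in the list can be sketched in Python. This is only an illustration of the idea, not the poster's implementation: the functions calc and interpolate are hypothetical, the operator names Sub and Div are assumed by analogy with Add and Mul, and Python's own expression parser stands in for whatever parser the real Calc uses.

```python
import ast
import json
import re

# Assumed names: only Add and Mul appear in the examples above.
OP_NAMES = {ast.Add: "Add", ast.Sub: "Sub", ast.Mult: "Mul", ast.Div: "Div"}


def calc(source):
    """Translate an infix arithmetic string into a prefix call string."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OP_NAMES:
            name = OP_NAMES[type(node.op)]
            return f"{name}({walk(node.left)}, {walk(node.right)})"
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return repr(node.value)
        if isinstance(node, ast.Name):
            return node.id
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")

    # mode="eval" parses a single expression; .body is its root node.
    return walk(ast.parse(source, mode="eval").body)


def interpolate(template):
    """Translate "text $var text" into right-nested Join(...) calls."""
    # Splitting on a capturing group keeps variable names at odd indexes.
    parts = re.split(r"\$([A-Za-z_]\w*)", template)
    terms = [
        part if index % 2 else json.dumps(part)  # variable vs. string literal
        for index, part in enumerate(parts)
        if part
    ]
    if not terms:
        return '""'
    result = terms[-1]
    for term in reversed(terms[:-1]):
        result = f"Join({term}, {result})"
    return result


prefix = calc("3+4*5")
joined = interpolate("Here's a $x variable.")
```

Both functions only rewrite notation: the output is the same program in the base syntax, which is the sense in which these mini-languages are "a notational convenience only".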

There are other languages, but they are mostly bigger than these and more difficult to explain so I'll just quickly list some of them:

- Delayed: like Clojure backquote
- Immediate: opposite of Delayed
- Parser: grammar description
- Singleton: grammar + instance
- Rewrite: term rewriting rules
- Translator: a richer version of Rewrite
- Goal, Clause, Query: Prolog embedding
- TypeDef, TypeClause: a Prolog extension for type system
- VirtualAssembly: assembly for the virtual machine

Syntax....

A distinction has to be made here, I think, between lexical and semantic syntax. Lisp macrology allows you to define semantic syntax -- that is, symbols which introduce subexpressions that are treated in a different way by the lisp system from the way procedures are treated. But these are not lexical syntax -- i.e., they tend to look like symbols in a framework of fully parenthesized prefix notation, and the uninitiated, who are looking for differences in the way things are written, will not be able to identify them as distinctive syntax in any way.

Don't get me wrong. We need semantic syntax, and we need the ability to define it, and having a uniform representation for a call is what allows the first-order code to work with AST transformations. But it does not serve the purpose of lexical syntax in making basic operations within the code clearly recognizable to those unfamiliar with the program. The way semantic syntax works, it means that any subexpression beginning with a token you don't know can mean anything at all, and that drives a lot of lisp newbies crazy.

Lisps have traditionally either skipped lexical syntax entirely, or treated it as an afterthought: 'expression as code for (quote expression) and so on.

I guess I've come to the conclusion that some things, like array dereferencing or property lists in symbols/objects, are so pervasive that they deserve a standard lexical syntax, like quote, that is instantly recognizable without even reading a "word" and remembering what that word does. And, for most programmers, it'll be easier if it's the same kind of syntax that other languages use.

So I humbly suggest, for any new lisp, a few bits of "standard" lexical syntax. If you let people write foo[22] for array references instead of (aref foo 22), and let people write foo.size for subfields in objects instead of (propertyref foo size) or whatever your system uses for object subfields, for example, it will make most programs easier to write, easier to understand, and shorter.
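To make the suggestion concrete, here is a minimal sketch in Python of a reader-level expansion that turns such lexical shorthand back into fully parenthesized forms, in the same way 'expression stands for (quote expression). The function desugar is hypothetical, and it handles single tokens only; a real reader would do this while parsing, and this sketch deliberately ignores cases such as an index that itself contains brackets.

```python
import re

# Trailing "[index]" where the index contains no brackets of its own.
INDEX = re.compile(r"(.+)\[([^\[\]]+)\]")
# Trailing ".field"; the target must start like a symbol, so 3.14 is untouched.
FIELD = re.compile(r"([A-Za-z_].*)\.([A-Za-z_][\w-]*)")


def desugar(token):
    """Expand one surface token into a fully parenthesized form."""
    if token.startswith("'"):
        return f"(quote {desugar(token[1:])})"
    match = INDEX.fullmatch(token)
    if match:
        target, index = match.groups()
        return f"(aref {desugar(target)} {desugar(index)})"
    match = FIELD.fullmatch(token)
    if match:
        target, field = match.groups()
        return f"(propertyref {desugar(target)} {field})"
    return token  # plain symbols and numbers pass through unchanged


examples = {
    token: desugar(token)
    for token in ["foo[22]", "foo.size", "'expression", "grid[1][2]", "3.14"]
}
```

Because the expansion happens before macros see the code, the uniform call representation that AST transformations rely on is preserved; only the surface notation changes.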