Lisp sans (((paren-theses ((hell)))))

We hear this refrain time and again - "I can't stand the parentheses in lisp". Left to myself, I'd argue that lisp/scheme's parenthetical expressions are simple and elegant, but I do know some good programmers who stay away from lisps purely for syntactic reasons.

Not that I searched deep, but I didn't find any good existing solutions to the problem other than switching to Ruby, Python and family, so I tried solving it myself and would like your comments on my solution to the lisp readability "problem".

I've described it here and put up C source code for the translator ez2scm on google code hosting as well.

In summary, ez2scm is a source translator that translates an indentation-based "ez" syntax to "scm" source.

hm, it seems hard to define

hm, it seems hard to define a function that returns the value of a variable, as the variable would either come on its own line, and I guess that would be interpreted as an application of a zero-argument function, or it would be at the end of a line, and be interpreted as the last argument of some function.

And how do you write ((a)) when you really need it?

I added that as an exceptional case ...

You can use a symbol followed by an "empty group" - () to indicate that you're invoking a function with 0 arguments. ((a)) will be written like this -
(a()()). If it's on its own line, you can just write a()().

The "exceptions" section is at the very bottom of my description page and describes this case.

ah, ok. But it might be good

ah, ok. But it might be good if you could find a way towards backwards compatibility. By that I mean that it would be good if ez resembled a superset of regular lisp syntax, so that you can freely intermingle both styles.

that's hard, however ...

In a way, you can already intermingle lisp style code to some extent in this ez syntax. You just have to remember that () means "grouping" and therefore repeated bracketing gets you nothing different.

(let [(a (+ x y))] (* a a)) 

Means the same thing in ez syntax as it does in scheme.

Single term expressions

The expression (symbol) can be substituted by symbol even if it's on its own line. They mean the same thing. The "empty-group" notation is the one that indicates that the symbol is to be evaluated as a function.

It has been done before.

Even by McCarthy himself (from Wikipedia):

McCarthy's original notation used bracketed "M-expressions" that would be translated into S-expressions. As an example, the M-expression car[cons[A,B]] is equivalent to the S-expression (car (cons A B)). Once Lisp was implemented, programmers rapidly chose to use S-expressions, and M-expressions were abandoned.
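To make the quoted translation concrete, here is a minimal Python sketch that rewrites the bracketed call form into an S-expression string. It only covers the `f[a,b]` call shape used in the example, not the rest of McCarthy's M-expression notation, and the function names (`tokenize`, `parse`, `m_to_s`) are invented for this illustration.

```python
import re

def tokenize(text):
    # Names, plus the three punctuation marks used by f[a,b] style calls.
    return re.findall(r"[A-Za-z0-9_+\-*/]+|[\[\],]", text)

def parse(tokens, pos=0):
    # Parse one term starting at tokens[pos]; return (tree, next position).
    head = tokens[pos]
    pos += 1
    if pos < len(tokens) and tokens[pos] == "[":
        pos += 1
        args = []
        while tokens[pos] != "]":
            arg, pos = parse(tokens, pos)
            args.append(arg)
            if tokens[pos] == ",":
                pos += 1
        return [head] + args, pos + 1  # skip the closing "]"
    return head, pos

def to_sexpr(tree):
    # Lists print as parenthesised forms, atoms print as themselves.
    if isinstance(tree, list):
        return "(" + " ".join(to_sexpr(t) for t in tree) + ")"
    return tree

def m_to_s(text):
    tree, _ = parse(tokenize(text))
    return to_sexpr(tree)

print(m_to_s("car[cons[A,B]]"))  # (car (cons A B))
```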

And I think I've seen others, but I really like yours. It gives the code a Haskell feel. And Haskell is the prettiest language that I know.

M-expressions and macros

IIRC, M-expressions were not compatible with macro expansions. Which leaves the question open whether any non-parenthesis derivative of lisp can ever be as powerful as S-expressions?

It seems to me that it's possible ..

What I've written is a syntax transformer from the ez syntax (let's call it that for now) to lisp s-expressions. It makes no assumptions about what those s-expressions mean. Therefore you can very well write macros in ez syntax that operate on the ez syntax expressions. For example -

defmacro swap2 (x y)
    list y x

will translate to

(defmacro swap2 (x y)
    (list y x))

and since every subsequent occurrence of swap2 a b will be translated to (swap2 a b), the macro will be expanded by the lisp system as usual.
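The indentation step in the example above can be sketched in a few lines of Python. This is not how ez2scm itself is implemented (that is a C program with more rules, such as infix operators and the empty-group notation); it only assumes two rules: an indented line becomes a further argument of the line above it, and a line with a single term and no children is left unwrapped. All names here are hypothetical.

```python
def count_terms(text):
    """Count top-level terms; a parenthesised group counts as one term."""
    depth, terms, at_boundary = 0, 0, True
    for ch in text:
        if ch == "(":
            if depth == 0:
                terms += 1
            depth += 1
            at_boundary = False
        elif ch == ")":
            depth -= 1
            at_boundary = True
        elif ch.isspace():
            at_boundary = True
        else:
            if depth == 0 and at_boundary:
                terms += 1
            at_boundary = False
    return terms

def parse_indented(source):
    """Group lines into [indent, text, children] nodes by indentation."""
    root = [-1, "", []]
    stack = [root]
    for raw in source.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        node = [indent, raw.strip(), []]
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][2].append(node)
        stack.append(node)
    return root[2]

def emit(node):
    _, text, children = node
    if not children and count_terms(text) == 1:
        return text  # a lone term stays as it is
    return "(" + " ".join([text] + [emit(c) for c in children]) + ")"

def translate(source):
    return "\n".join(emit(node) for node in parse_indented(source))

print(translate("defmacro swap2 (x y)\n    list y x"))
# (defmacro swap2 (x y) (list y x))
```

The output is printed on one line rather than re-indented, but it is the same s-expression, so a macro defined this way sees ordinary list structure.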

Why do you say that non-sexpr syntaxes (including M-expressions) are incompatible with macros? In fact, one of the reasons I bothered to write the translator is that I won't lose macros.

Differences with sexpr syntax ...

I intended this "ez" syntax for human entered and read code. I did not intend data I/O to be performed using this syntax. Hence read and write are no longer inverses purely within this "language". However, there's no harm in read and write using sexpressions in order to retain their inverse nature for data I/O and use this syntax only for the load function.

I can very well imagine a pretty printer for sexpressions which will print out in this ez syntax. Therefore theoretically read/write *can* be inverses as well. It's a question of whether it's desirable, however. I'd personally prefer the rigor of sexpressions for data transmission.

My purpose in writing this translator is to provide a stepping stone for getting into lisp if you're the kind to be influenced by syntactic sugar. The translator makes it easy to see how much of other languages is merely syntactic sugar for lisp sexpressions.

Just an idea...

...but a more useful facility might be to write translation both ways. That is, it should be possible to take any Scheme code and display it in the form of ez syntax. Not only would you make it easier to write Scheme code, but you should have the advantage of allowing novices easier access to others' Scheme code.

On the plus side, this would allow you to run translations back and forth to prove that the translation is correct. On the downside, it's possible that you might end up with a Babelfish effect where you run text back and forth between languages and end up with text that loses its original meaning.

you're right ...

I think that should be quite easy to do in scheme using the built-in reader. Like a pretty printer or something.

.. reserved for another weekend :)

thanks visscher :)

I consider that a compliment :)
...but it's no accident. The : notation for the "dotted pair" is a silent tribute to Haskell syntax :)

If you JUST want to replace the parens

use FORTH-like colon and semicolon for left and right parens, respectively.
so :::expression;;; replaces (((expression)))

Or just use semicolons as right-hand parens (terminators) and newline as an implicit replacement for the left-parens. So this
(blank line)
(blank line)
(blank line)expression;;;

is a replacement for (((expression)))

Or provide FORTH with continuations and macros and voila - some weird Scheme-FORTH (mostly paren-free) postfix chimera.

it's not for me ...

I'm happy enough to use the parentheses notation.
I did this for those who would like to try lisp/scheme but are scared shit of the parentheses explosion.

Orphaned

Some others who've previously designed such syntaxes have said something similar. It seems likely that this is at least a factor in the failure of these syntaxes to gain traction — who wants to use something that's disclaimed by its author at the time of its creation? There are many related factors, e.g. who's going to fix problems found in practice.

Dylan

Wasn't your problem already solved a decade ago with Dylan?

Dylan Wiki
D-Expressions

Design decisions

Alternative syntax for Lisp may be one of the great bicycle-shed issues - it nearly is the PL bike-shed question - but I think it's a really knotty question when you try to look at it seriously.

To start with, how should an alternative syntax be better than S-expressions? Should it be more familiar to programmers who are used to mainstream languages? More readable by some more objective standard? Easier to write without editor support? Harder to misindent? These are all sensible objectives, but they're unlikely to be perfectly compatible. There are other features the new syntax might or might not have too. Should the code be automatically transformable into S-expression code that is fit for human consumption? (Remember column width here.) What about the reverse transformation? Should it be compatible with all S-expressional code and data, or only with some existing Lisp dialect, or only with some new dialect?

Links for Alternate Syntaxes for Lisp

- Pratt Parser
- CGOL: Algol-like language that compiles into Common Lisp (update of the Pratt parser)
- Indentation-sensitive syntax
- PLT-Scheme has a "Honu" module which provides Java-like syntax
- CLisp - Conversational Lisp: Lisp extended with Infix operations

Steele and Gabriel: The Syntax Question

Also, I recall reading that Interlisp let you mix infix notation with regular Lisp syntax.

Or try a different language: Pico, JavaScript, Ruby, Dylan (mentioned above), etc.

Gambit's six-script

Gambit Scheme also has a syntax called six-script, some small examples of which can be seen here. [Edit: make that one small example of which.]

More links

More Lisps: Arc (see point 5) and Qi (which has syntax for pattern-matching).

"The Evolution of Lisp" by Steele and Gabriel, pp. 82-5: "The Syntax Question" is excerpted from this.

More SRFI 49 flamage (and my own measly tuppence (part 2)).

I found it rather strange

I found it rather strange that an SRFI existed for this as you point out.

After these discussions and some more searching, I'm getting a bit tired really. As far as I can tell, I've shown a reasonably minimal solution to the "accessibility problem" of lisps - for all lisps in question (well, except the backquote notation which is pending).

So at the very least - if someone says "I'm scared of lisp's parentheses" - I'd say, "well you can use this equivalent thing instead". Paul Graham, for all the respect I have for his popularization of lisp, has the luxury of musing on these issues for an unbounded time interval, but I'd rather have a usable solution now than be asked to breathe hope instead.

Backquotes and other stories

(well, except the backquote notation which is pending).

Backquote notation is pretty critical if you want to support Lisp, and not just a fixed Lisp subset. As I'm sure you know, the standard Lisp macro system, and a number of Scheme macro systems, depend on backquote notation.

S-expressions are a way of representing arbitrary syntax trees, not just (standard) Lisp syntax. For example, it's easy enough to support expressions such as the following in Lisp:

(+ (infix (x + y) * z) (postfix z y / x -))

...where the syntactic rules in each of the major subexpressions are different. A critical question for any general-purpose alternative Lisp syntax is how well it supports such embedded sublanguages. Many such sublanguages are commonly embedded in Lisps, including XML, general logic languages, more constrained rule-based languages, data definitions of all kinds, etc. To support these sorts of applications properly, you're almost certainly going to need a way to define new syntactic sugar, since operators such as ":" and "::", with their specific precedence rules, are only going to be appropriate in specific contexts.

So at the very least - if someone says "I'm scared of lisp's parentheses" - I'd say, "well you can use this equivalent thing instead".

Ignoring the issue that it's not yet equivalent, those people would have some valid reasons for being cautious about adopting such a syntax. The biggest one is that until someone can demonstrate that non-trivial programs have been written and maintained using the syntax, and that it was found viable during that process, its viability is not certain.

For example, one issue I see with the syntax is that rearranging of code seems to change the punctuation requirements, such as when changing a multi-line list to appear on a single line, commas need to be inserted. That's not a great property for a syntax to have, since rearranging of code is common. Other issues like this may not become apparent without significant experience with the syntax.

Useful alternative syntaxes are possible, but it's a much more subtle problem than it might appear at first.

I agree in general.

I agree in general.

Regarding the backquote notation, I do understand its significance, even though I think it can be implemented as a macro itself (though less compact). For example -

(template Hello members of (unlit group) - (unlit-splice (map member-name list-of-members)))

which could expand to -

(append (list 'Hello 'members 'of group '-) (map member-name list-of-members))
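As a rough check that such an expansion is mechanical, here is a Python sketch of it over nested lists. The `template`, `unlit` and `unlit-splice` names come from the example above; everything else (the function names, the list representation, treating every other item as a quoted literal) is an assumption made for illustration, and nested templates are not handled.

```python
def expand_template(items):
    # Literal items are quoted, (unlit x) inserts x unquoted, and
    # (unlit-splice e) contributes e as a whole list to the append.
    segments, current = [], []
    for item in items:
        if isinstance(item, list) and item and item[0] == "unlit-splice":
            if current:
                segments.append(["list"] + current)
                current = []
            segments.append(item[1])
        elif isinstance(item, list) and item and item[0] == "unlit":
            current.append(item[1])
        else:
            current.append(["quote", item])
    if current:
        segments.append(["list"] + current)
    return ["append"] + segments

def to_sexpr(form):
    if isinstance(form, list):
        if len(form) == 2 and form[0] == "quote":
            return "'" + to_sexpr(form[1])
        return "(" + " ".join(to_sexpr(f) for f in form) + ")"
    return form

template_body = ["Hello", "members", "of", ["unlit", "group"], "-",
                 ["unlit-splice", ["map", "member-name", "list-of-members"]]]
print(to_sexpr(expand_template(template_body)))
# (append (list 'Hello 'members 'of group '-) (map member-name list-of-members))
```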

The indentation based syntax is not as rigorous and consistent as s-expressions, but I don't see people complaining about a similar rearrangement problem with Haskell's syntax. In a way, by making the syntax similar to Haskell, I've tried to borrow the work done by others in proving it in the field.

I believe we always play to the strengths of our tools as long as they don't hinder us. Being able to define infix operator macros is quite a nice feature I think :) If we assume "macro" defines macros like "lambda" defines functions (like in a dialect I use), we can write -

$ := (macro (a b) (list a b))

to define the Haskell "apply" operator '$' that collects everything to its right before applying. So that (f $ sum alist) expands to (f (sum alist)). That just duplicates '::', but it shows how you can invent it in the language itself.
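The effect of that operator can be modelled in Python on nested lists. This is only a sketch of the rewriting, not of the dialect's macro facility: `expand_dollar` and `group` are hypothetical names, and it assumes that a group holding a single term is the same as the term itself.

```python
def group(terms):
    # A group holding a single term is just that term.
    return terms[0] if len(terms) == 1 else terms

def expand_dollar(expr):
    """Rewrite (a ... $ b ...) as ((a ...) (b ...)), rightmost '$' first."""
    if not isinstance(expr, list):
        return expr
    expr = [expand_dollar(e) for e in expr]
    if "$" not in expr:
        return expr
    i = expr.index("$")
    left = expr[:i]
    right = expand_dollar(expr[i + 1:])  # handles any further '$' to the right
    return [group(left), group(right)]

print(expand_dollar(["f", "$", "sum", "alist"]))
# ['f', ['sum', 'alist']]
```

Because the right-hand side is expanded before it is grouped, a chain such as f $ g $ h x nests to the right, which matches how '$' behaves in Haskell.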

good links!

Thanks for those great links! In particular, the SRFI for an indentation sensitive syntax is pretty darn close to what I've come up with, though there are still some rough edges there like lack of infix notation and the use of 'group'. Maybe I should propose this "ez" syntax there instead.

I use PLT Scheme regularly and I don't know how I missed Honu. I can't figure out what I'd use it for at the moment though.

I'm not worried about other languages/runtimes with such syntax. I just want to make all lisp dialects accessible using this syntax.

Prefix versus infix, and code = data

One of the defining features of Lisp is that the structure of the code corresponds in a simple way to the structure of the AST after parsing. This is important when writing macros which manipulate the AST directly as data, as Lisp macros do.

As soon as you introduce infix operators, the correspondence of text to AST becomes something more complex, as the infix expressions need re-ordering according to associativity and priority rules for the operators. This will make it more difficult to write macros. As far as I know, parsers for infix expressions always produce a prefix structure in the AST, e.g.:

 a * b + c => (ADD (MUL (ID 'a') (ID 'b')) (ID 'c'))

Here, the prefix identifiers ADD, MUL, and ID are node identifiers for addition expression, multiplication expression, and identifier. This assumes that ADD and MUL are language primitives. If they are just functions, then ADD => (FUNAPPLY '+'), and so on.
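A small precedence-climbing parser shows the re-ordering at work. This is a Python sketch under simple assumptions: only identifiers, the four arithmetic operators and parentheses are supported, the operators are left-associative, and SUB and DIV are node names added here alongside the ADD, MUL and ID used above.

```python
import re

PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}
NODE_NAMES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def tokenize(text):
    # Identifiers, operators and parentheses; whitespace is skipped.
    return re.findall(r"[A-Za-z_]\w*|[-+*/()]", text)

def parse_atom(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_expression(tokens)
        tokens.pop(0)  # discard the matching ")"
        return inner
    return ("ID", tok)

def parse_expression(tokens, min_prec=1):
    left = parse_atom(tokens)
    # Keep absorbing operators that bind at least as tightly as min_prec.
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        right = parse_expression(tokens, PRECEDENCE[op] + 1)
        left = (NODE_NAMES[op], left, right)
    return left

def parse(text):
    return parse_expression(tokenize(text))

print(parse("a * b + c"))
# ('ADD', ('MUL', ('ID', 'a'), ('ID', 'b')), ('ID', 'c'))
print(parse("a + b * c"))
# ('ADD', ('ID', 'a'), ('MUL', ('ID', 'b'), ('ID', 'c')))
```

The two inputs have the same shape as text but produce differently shaped trees, which is the extra step a macro writer has to keep in mind.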

My thoughts on this are that one can "tidy up" Lisp syntax, e.g. by using layout rules to define some forms of grouping, but the basic syntactical structure should use prefix notation, to maintain an obvious relationship between code and AST.

Now I see what's generally

Now I see what's generally meant by "alternative lisp syntaxes break macros". But you do say "to maintain an *obvious* relationship between code and AST". I suppose alternative syntaxes fall into a continuum with varying degrees of obviousness in this connection. I do see that this syntax, by introducing infix notation, lowers the obviousness of this connection when using s-expressions.

.. but I do feel it's not too far off - at least not as far off as infix notations with precedence rules, where a * b + c and a + b * c, though formally similar, yield different ASTs.

Lisps use infix notation too btw - the "dotted pair". Syntactically, (a . b) is a 3 term s-expression which gets read in as a cons pair.

It's easy

Introduce syntax for syntactic templates (quasiquotation), both as expressions for construction and as patterns for pattern matching.

As a side effect, this approach (compared to representing code using naked Lisp lists) allows for hygiene and source location tracking.

Here's another proposal with

Here's another proposal with a lengthy rationale, from David Wheeler. I've only skimmed it.

If you're exploring this area I'd note that something like a majority of lists in program code take the form "(symbol argument)" with a single argument. Something like a majority of the remainder have no sublists. So, to get rid of the most possible parentheses, "f a b c" should mean "(f (a (b c)))", and we'd also like parenless notation for "(f a b c)" where the arguments are atoms -- such as "f a,b,c". Combine with significant indenting and you get

define factorial n
  if = n,0
     1
     * n
       factorial - n,1
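The three rules in that description are enough to write a toy reader. The Python sketch below is one reading of them, not the poster's own code: space-separated words nest to the right, a trailing comma-separated word spreads into several arguments, and indented lines become further arguments of the line above. All function names are made up.

```python
def parse_line(text):
    words = text.split()
    args = words[-1].split(",")        # "n,0" spreads into two arguments
    if len(words) == 1:
        return args[0] if len(args) == 1 else args
    term = [words[-2]] + args          # innermost call
    for word in reversed(words[:-2]):
        term = [word, term]            # "f a b c" means (f (a (b c)))
    return term

def parse_tree(source):
    """Group lines into [indent, text, children] nodes by indentation."""
    root = [-1, "", []]
    stack = [root]
    for raw in source.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip(" "))
        node = [indent, raw.strip(), []]
        while stack[-1][0] >= indent:
            stack.pop()
        stack[-1][2].append(node)
        stack.append(node)
    return root[2]

def build(node):
    _, text, children = node
    term = parse_line(text)
    if children:
        if not isinstance(term, list):
            term = [term]
        term = term + [build(child) for child in children]
    return term

def to_sexpr(term):
    if isinstance(term, list):
        return "(" + " ".join(to_sexpr(t) for t in term) + ")"
    return term

SOURCE = """\
define factorial n
  if = n,0
     1
     * n
       factorial - n,1
"""

for node in parse_tree(SOURCE):
    print(to_sexpr(build(node)))
# (define (factorial n) (if (= n 0) 1 (* n (factorial (- n 1)))))
```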

I've ended up sticking with plain old Lisp instead of any of these schemes, though.

Wheeler's proposal

I actually like that proposal, but I'd like to see what he'd get to, if it weren't for goal 6 on his "paper":

Backward-compatible. Ideally, it should be able to read regular s-expressions (at least normally-seen formats of them) as well as the extensions. I’m willing to give a little on this one where necessary.

To me, this just seems too tall a goal; it certainly limits the options.

Intuitive equals familiar

.. to quote Jef Raskin.

Apart from the fact that lisp syntax has been growing on me particularly after I stumbled on DrScheme (I can't praise those guys enough), the following (using ez2scm) looks better to me than Wheeler's proposal -

define (factorial n)
    if (n = 0)
        1
        n * factorial (n - 1)

You can even do the "right thing" -

define (n !)
    if (n = 0) 
        1 
        n * (n - 1)!

compatibility

... and from the compatibility point of view, ez2scm will parse the following correctly as well -

(define (factorial n)
    (if (= n 0)
        1
        (* n (factorial (- n 1)))))

S-Expressions as Sentences.

Take the factorial algorithm in English:

The Factorial of N is 1 if N = 0 otherwise the product of N and the Factorial of N - 1.

Let's make the clauses explicit, using English punctuation and grammar.

The Factorial N:
  If N = 0, then 1, otherwise the product of N and the Factorial of (N - 1).

Now take out the redundant grammatical terms.

Factorial N:
  If N = 0,
     1,
     product N Factorial (N - 1).

But we've lost the ability to analyze product N Factorial (N - 1) because we're missing
those terms now, so make the clauses explicit again.

Factorial N:
  If N = 0,
     1,
     product N (Factorial (N - 1)).

We can remove the parentheses around N - 1 if the infix notation is sufficient, giving:

Factorial N:
  If N = 0,
     1,
     product N (Factorial N - 1).

And we can change product to infix * if we like, giving:

Factorial N:
  If N = 0,
     1,
     N * (Factorial N - 1).

So here parentheses explicitly delineate a sentence, a colon starts a list of clauses and a stop ends one,
and a comma divides subclauses. And 'if', '=', '*' and '-' are special operators.

If we were to make everything into explicit sentences then we'd end up with something like ...

(definition factorial (n)
  (if (= n 0)
      1
      (* n (factorial (- n 1)))))

... and that's what I think lisp syntax really is.

the Mathematica approach

Mathematica takes an approach that may be of interest. The Mma front end, called a notebook, allows the user to enter expressions in algebraic notation, with standard operator precedence in effect. For instance, the user might enter a * x^2 + b * x + c. The expression the Mma kernel will ultimately evaluate is Plus[c, Times[b, x], Times[a, Power[x, 2]]], which is pretty much isomorphic to lisp notation.

The user need not see the underlying expression, but can easily do so:
FullForm[a * x^2 + b * x + c] yields Plus[c, Times[b, x], Times[a, Power[x, 2]]].

The user is free to perform structural operations on expressions.
term = a * x^2 + b * x + c
term /. Power -> foo yields c + b * x + a * foo[x,2]
Here /. denotes the substitution, here of foo for Power.
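The same idea can be mimicked in Python by holding the expression as nested tuples of the form (head, arg, ...). This is only a model of the structure, not of how Mathematica works internally; `full_form` and `substitute` are hypothetical helpers, and `substitute` handles only the symbol-for-symbol case shown above.

```python
def full_form(expr):
    # Print a (head, arg, ...) tuple in Head[arg, ...] notation.
    if isinstance(expr, tuple):
        head, *args = expr
        return f"{head}[{', '.join(full_form(a) for a in args)}]"
    return str(expr)

def substitute(expr, old, new):
    # Replace every occurrence of the symbol `old`, heads included.
    if isinstance(expr, tuple):
        return tuple(substitute(part, old, new) for part in expr)
    return new if expr == old else expr

term = ("Plus", "c", ("Times", "b", "x"), ("Times", "a", ("Power", "x", 2)))

print(full_form(term))
# Plus[c, Times[b, x], Times[a, Power[x, 2]]]
print(full_form(substitute(term, "Power", "foo")))
# Plus[c, Times[b, x], Times[a, foo[x, 2]]]
```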

Structural operations are possible because the lisp-like syntax is maintained, though the user need not deal with it.

Similarly, the notebook provides for separate output forms. Typically, evaluated expressions look something like typeset mathematics, but the underlying expression remains completely regular. The user can define new input forms and new output forms.

I had no more need when Mathematica came out

for that type of program so I never played with it, but one of the marketing claims I remember was

"It's written in C so it's far faster than the older LISP-based CA systems".