Babel-17 v0.2 spec is out

Hi,

I am currently working on a new programming language which I call Babel-17. It is dynamically typed, mostly purely functional, but also object oriented and supports encapsulation via modules. I have written up a preliminary spec of it, which you can read here. There is no implementation of it yet, but I want to get implementations based on Java and on Objective-C out there as soon as possible. Yes, hopefully you can write parts of your iPad programs in Babel-17 soon :-)

Best,

Steven Obua

Edit: The new version of the spec is v0.21, and there is a reference implementation in form of an interpreter written in Scala/Java. There is also a Netbeans plugin for Babel-17 v0.21 available at www.babel-17.com .

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

I'm not sure it would be safe to learn it...

... I might wind up not knowing what "I" means.

Oh, believe me, it's totally

Oh, believe me, it's totally safe to learn it, just give it a try :-D

I think my response to every

I think my response to every new language hereafter is to ask what justifies it? Why should it exist? Why should anyone want to learn it? What does it do that current languages don't? Which languages does it obviate?

Sometimes designing a new

Sometimes designing a new language is just scratching an itch, or a personal journey to enlightenment. The OP should start publicizing with an implementation though, starting to talk about this with just a spec is too early.

Yes, I don't expect too many

Yes, I don't expect too many people to get the language without an implementation. An implementation will make it easier to get across why this new language has its place. Actually I have implemented the language "in my head" according to the spec, but of course there may be problems with the spec I didn't think of yet. My reason for putting out this spec so early is for getting feedback; it is easier to correct mistakes in a spec than in an implementation.

But you are being very

But you are being very optimistic if you believe that any expert language designers will go review your spec for you. Language design is like any other idea building process: you need your 5 second elevator spiel, and maybe a one paragraph abstract for those that are more interested, then useful examples along with a prototype implementation to run those examples. Spec...no one will be interested until the former is done and they actually buy into your language (which is very difficult).

Trust me, if Matz came to us with a Ruby spec before he released Ruby, we would have had no reason to care.

Maybe some will, maybe some

Maybe some will, maybe some won't. I have the spec now, and it doesnt cost me anything to publish it. Long live the Internet :-)

Reasons for Being

- a collections framework is built right into the language; I think you can do this properly and clutter-free only in a dynamically typed language
- the language is dynamically typed but still features true encapsulation of modules and objects
- the language has no side-effects; therefore it also supports concurrency without any worries; it also support laziness, but strict evaluation is the default
- it is a functional language; nevertheless, you can have objects; actually, objects play a big role in customizing the language to your needs
- although it is side-effect free and functional, locally you can write code that looks just like imperative code; you can have your while loops, and so on, and there is a powerful for-loop construct that makes fold-like functionals superfluous
- it has built-in memoization
- all features of the language are carefully designed to live in harmony with each other; this language is just freakin beautiful :-)
- despite all its great features, it is a small language that I plan to implement for those platforms that I program for (which currently is Java and Objective-C, but Javascript would be nice, too)

All of above points are "must-have"s for a language I really enjoy programming in. To my knowledge, until now there was no other language that fulfilled all of those points.

A few comments on the language

U+FF1D is compatibility-equivalent to =, so it's a bad idea to have it mean anything but =. There are many mathematical symbols in Unicode that represent equivalence.

Tuples: I suggest you follow the lead of Pure here, rather than making tuple syntax just a synonym for list syntax. In Pure, () is the empty tuple and "," is a constructor that creates and, if necessary, concatenates tuples. In particular, a,() == (),a == a; if both arguments are tuples, the effect is concatenation rather than constructing a 2-tuple whose elements are tuples (in fact, it becomes impossible to have tuples containing tuples). There are no 1-tuples. All this allows tuples to be represented in the implementation language as vectors, which is very nice.

I suggest that you introduce syntactic sugar for "lazy".

RE: impossible to have tuples containing tuples

Why is this a good thing?

[edit]: I mean outside of a marginal and dubious benefit to implementation.

Algebra and first-order logic

Most algebraic definitions I can think of do not require nested tuples. Think of the definition of a ring, as an example.

Likewise, for finite state automata, you need only a three tuple. for forming a petri net, you need only a four tuple. For a marked petri net, a five tuple. No nested tuples are necessary.

The basic benefit of first-order logic is that you are dealing with atoms and you have a minimal surface area for understanding the component.

How many need

How many need tuple concatenation?

It seems you're suggesting that the ability to work with flat tuples (and one often can, so long as one avoids generic programming) is reason to always flatten tuples. This argument does not seem cogent.

Generic programming might often do stuff like a zip :: [a] -> [b] -> [(a,b)]. In a language that concatenates tuples to avoid nested tuples, the type of zip will vary widely in the case of a or b being tuples. The type for zip will be predictable from the input types, so it can still be type-safe, but it will be much more difficult in a generic context to then utilize a function that needs to extract just the 'b' element from the zip result - it requires reflection to learn more about the 'a' type. I think this is a case that demonstrates utility of nested tuples.

If you come into the function knowing that a,b are tuples, you are always free to explicitly create a larger tuple from them or explicitly extract the features you desire.

Pure and Babel-17 are dynamically typed ...

... so the type-safety objection doesn't apply, but it's true that zipping lists which contain tuples may produce unexpected results.

Pure's tuples

Pure already has (nestable) lists which may consist of arbitrary values, so there's no need for a separate tuple type which does that. Pure's tuples are meant as an alternative, simple aggregate data structure for function parameters and return values, see this thread on the Pure mailing list for details.

Tuples and rings

Think of the definition of a ring, as an example.

Does the definition of a ring require (flat) tuples at all?

Anyway, here it seems necessary to object: What about direct sums of rings? Those consist of tuples --flat tuples, to be sure, but tuples. Now what about direct sums where some summands are direct sums? (And can you say it 5 times fast?) These seem to me naturally to involve nested tuples; and the fact that you can flatten them (to realise the ‘associativity-up-to-isomorphism’ of direct sum) doesn't seem to me to be the same as saying that they aren't there.

You win

David's example is also good.

Thanks for pointing out Pure

Thanks for pointing out Pure which seems to share a lot of mindset with Babel-17. Especially interesting is that it compiles to LLVM which should also be a great target for Babel-17. Anybody here knows if LLVM also targets the JVM as a backend?

I liked U+2260 too much for

I liked U+2260 too much for "not equal", I guess. The U+FF1D felt like a bad idea from the beginning ... I changed both the "equal" and "not equal" symbols in the spec. By the way, the working copy of the spec that develops along with the implementation and user input like yours can always be found at Babel-17s Google Code site .

I like your idea of having a distinction between vectors and lists, as they have different performance characteristics, obviously, but I don't like stuff like "there is no 1-tuple". So probably it will be something along the lines "(a,b,c)" is a vector, "[a,b,c]" is a list, but the syntax for dealing with both is the same. You can convert between them via the "list" and "vector" methods, so for example "[a].vector" gives you a 1-tuple. The syntax for appending an element to a vector like in "(a,b,c) + d" is different from appending a vector to a vector like in "(a,b) ++ (c, d)".

Any idea for how the syntactic sugar for lazy should look like ? I think Scala uses curly braces, but these are already used for sets in Babel-17 .

Edit:Changed source code versioning at google code from svn to mercurial, therefore different link now.

Lazy ~

I've always favored the tilde (~) for laziness sugar if it isn't being used for unary negation.

Ok, let's try tilde. It has

Ok, let's try tilde. It has the same low priority as the lazy keyword, so that you can write large expressions behind it without the need for brackets.

By "there are no 1-tuples" I mean ...

... simply that a 1-tuple of x is identified with x. Pure's predecessor Q supported 1-tuples using Python syntax: () is the 0-tuple, (x,) is the 1-tuple of x, (x,y) and (x,y,) are both the 2-tuple of x and y, etc.

1-tuples sound good in theory, but it's not actually very useful to give them a semantics different from their contents. For example, when you retrieved a single column from a SQL table in Q, you got back an answer of the form [(1,), (2,), (3,), ...], a list of 1-tuples. In Pure, you get the
more plausible [1, 2, 3, ...]. The former is better if you don't know you're only asking for a single column; the latter is better if you do.

1-tuples

do either exist or they don't. If they do exist, you cannot identify them with their content in general. For example, for every tuple "t", "t.length" is supposed to return the length of that tuple; if "t" is not a tuple, then "t.length" may not make sense. If 1-tuples do not exist, then there just is no tuple "t" such that "t.length == 1".

Now, if lists of length 1 can exist, I don't see why tuples of length 1 shouldn't also exist ...

As for the SQL example: This would still be possible, but it would then be up to the SQL engine to make that decision; it could choose instead of returning a list of 1-tuples to return a list of the contents of the 1-tuples instead.

I think removing 1-tuples from the language may seem elegant in concrete usages of tuples, but will probably lead to akward general algorithms if these algorithms use tuples. For example, let's say "t.sort" sorts a tuple and removes duplicates in the process. Then "t.sort.length" might throw an exception, depending on if "t.sort" collapses to a single element and on how that single element looks like.

The reason "no 1-tuples" works well in Pure ...

... is that Pure is a term rewriting language, not a programming language. Although term rewriting languages and programming languages (especially ones like Babel-17) can often be used for the same purpose, there are differences. In term rewriting languages, the only kind of value is that of a term. In programming languages, there are integers, functions, objects etc. This means that in a term rewriting language, it is no biggy to identify a 1-tuple (x,) and its content x, because both (x,) and x are just terms; it is therefore easy to say that (x,) and x is just different syntax for the same term! But in a programming language, (x,) is a 1-tuple, and x can be something else entirely.

Of course a term rewriting

Of course a term rewriting language IS a programming language. I hope it is clear what I mean by differentiating between them for the sake of 1-tuples :-)

Argument passing

In many functional languages, functions take a single parameter of tuple type. In such languages, () [unit] is not a tuple, but is syntactically confused with the empty tuple, and (a) reduces to a (that is: there are no singleton tuples).

The problem in such languages, when there are no 1-ary tuples, is that there is no way to distinguish a function taking no arguments from a function taking a single argument of unit type. Compatibility with existing languages is therefore impeded.

The "no parameters" and/or "no return type" cases ultimately led us to decide that functions in BitC should be n-ary. I'm still not sure this was the right decision, but it did simplify compatibility somewhat.

Function arguments

How to write function arguments seems to me mostly to be a matter of tradition. For example, a lot of people just do not like f(x) being written as f x.

For Babel-17, having no static type system and all, you actually cannot distinguish at call site a function taking no parameter from a function taking one parameter. It is just at the definition site where Babel-17 distinguishes between no arguments and 1 argument. If you want to define an n-ary function comfortably in Babel-17, you need to pack the n arguments somehow into a single one (usually, via tupling).

I am also not sure if that is the right decision, but time will hopefully tell :-)

Not just tradition

OK, so how do you distinguish a function taking no argument from a function taking a single argument of type unit?

It doesn't matter so much when these are concrete types, because "a difference that makes no difference is no difference". It gets a bit thorny when a function takes a single argument of parametric type which then gets specialized to unit. At that point you're going to run into an implementation-layer decision:

1. Share code, in which case you really want all functions to take a single argument so that the code doesn't have to be specialized when the parameter type gets instantiated over unit. The problem with this is that it becomes tricky to specify functions that intentionally take no arguments.

2. Emit specialized code, in which case you have the opportunity to notice that all parameters and return values of unit type can be elided from the implementation-layer calling convention and there's no real problem.

Above and beyond this, the problem with the "all functions are 1-ary" convention is that it doesn't couple well with ML-style application, it's hard to do a good job on error checking, and it's hard to do varargs. These are merely practical issues, but that does not make them unimportant.

Lots of thorns here

This is an important design questions with many far-reaching implications, and one that I've thought about a fair amount. I'm curious what you mean by this comment:

Above and beyond this, the problem with the "all functions are 1-ary" convention is that it doesn't couple well with ML-style application...

As I recall, ML is one of the first mainstream employers of the "all functions are 1-ary" convention. The convention seems to fit very comfortably. What do you have in mind?

The error checking and varargs points, incidentally, are very clear.

No static types

Babel-17 knows two kinds of functions: those that take no arguments, and those that take exactly one argument. In practice, every value in Babel-17 needs to be tagged with its type, and there are just two different tags for functions with 0 args and for functions with 1 arg.

It is important to keep in mind that Babel-17 has no static types.
For example, when you use a no argument function f in an expression where an argument x is supplied to f, you do not get any compile time error. And you only get a runtime error, if the return value of f is not again a function itself.

- error checking: you don't do ML-style type checking in Babel-17 => static error checking in Babel-17 will just not be as good as in ML; you will catch most type related errors by just running your program, though ...

- var args: Var args are no problem at all in Babel-17. For example, via

def f (x, y as ...) = some function body

you can easily define a function taking a variable number of parameters.

- ML style application: In ML, you often define a binary function via

fun f x y = some function body

In Babel-17, you could do something similar via

def f x = (y => some function body)

but normally you'd just write

def f (x,y) = some function body

Note that ML itself often follows that convention, for example look at the type of "op +" which is not int => int => int, but int*int => int.

Anonymous no-arg functions

I just noticed that there is no (really) easy way to define anonymous no-arg functions in Babel-17 right now. You could do it via

begin
def f = some body
f
end

but this looks a little bit clumsy. Maybe it would make sense to introduce the lambda form

(=> some body)

instead of the above. There is already the lambda form

(pattern => some body)

for 1-arg functions, so it would be kind of consistent.

Why have functions of no arguments?

Shap,
Couldn't you fix this issue by saying that there is no such thing as a function of no arguments? As you say, it is a difference that makes no difference*, so why couldn't a zero-argument function be syntactic sugar for a single-argument function whose argument is of unit type?

It does make a tiny difference, I think. Considering divergence behavior, there are two functions from no arguments to (), one which always returns () and one which always diverges. But there are three functions from () to (), the third converges iif its argument does. Is this a good enough reason to preserve the distinction?

There are no functions of no

There are no functions of no arguments anymore in Babel-17 v0.21, at least there is no such value type. But you can still do the following:

def x = random 2
(x, x)

The above code will evaluate not only to (0,0) and (1,1), but also to (0, 1) and (1, 0).
But the behavior differs from a ()-function:

def x = random 2
val y = x
(y, y)

will always evaluate to either (0,0) or (1,1).

Val and Def?

Is the difference here the choice of 'val' vs. 'def' to describe the variable?

Yes, val is for

Yes, val is for non-recursive variable bindings that are evaluated immediately, and def is for (possibly) mutually recursive function definitions. Thinking again, it really IS like a def without a parameter is syntactic sugar for a def with a default-parameter () if you convert each use of the defined function to a call with the default-parameter.

So the above examples can be seen as syntactic sugar for

def x () = random 2
(x(), x())

and

def x () = random 2
val y = x ()
(y, y)

From an implementation point of view, there is a type of functions with no arguments. But values of this type only live within environments. Whenever you take such a value out of the environment, you apply the default argument.

Douglas, I think your

Douglas,
I think your comment hits the nerve of what is going on in Babel-17 with parameterless defs, see my comment in reply to dmbarbours comment:

There is such a thing as a function of no arguments, and you could view this thing as a single-argument function whose argument is of unit type and which is specially marked, such that whenever you encounter such a value, you apply it to the unit value. In this way you also make sure that the argument of this thing always converges (because it always is the unit value).

Babel-17 v0.21 is out.

It took me longer than I expected, but finally both a Babel-17 spec (v0.21) and a reference implementation together with a Netbeans plugin are available. The home of Babel-17 is www.babel-17.com. Enjoy!