Red Hat's New Language

Red Hat has apparently unveiled a new JVM language. Slides here and here. I've got a plane to catch, so I can't do a proper post, but I thought some might be interested. There is nothing more fun than tearing apart someone else's language design, right?

The language is apparently

The language is apparently called Ceylon. Here's a link to a fairly interesting blog post, which also links to mirrored slides.

Got tea in my coffee

Looks like a cleaner version of Java.

Looks OK, but what for?

Looks decent for an OO language, although I didn't see anything you wouldn't have in Scala already. So I'm not sure why anybody would want it.

Some clear minuses, too, like closures but no function expressions, motivated by this bullet from their slides:

Java’s syntax is rooted in standard, everyday mathematical notation taught in high schools and used by mathematicians, engineers, and software developers
-- not the lambda calculus used only by theoretical computer scientists

I mean, huh?

I just read that myself.

I just read that myself. Total "what the hell?" moment for me too.

Most of the points on that slide caused a similar, albeit tamer reaction.

Not in Scala

That slide made me cry. But

although I didn't see anything you wouldn't have in Scala already

Union types without the ceremony of an algebraic data type (or Scala's not-quite-ADT equivalents) sounds pleasant. There's some allusion to "metaprogramming". There are no details on exactly what aspect of metaprogramming he means, but it's likely he's talking about something Scala doesn't do particularly well. And one slide talks a bit about modules; given the surrounding context mentioning OSGi, I gather he's talking about having deployable blobs that can be composed without DLL (or jar) hell.

Also, as always, programming language design is as much about what you don't put in as what you do. Some people think that Scala's type system is a bit much, or that implicits are too magic, or that Scala's relatively unconstrained identifier naming (usually slightly mistermed "operator overloading") is a source of abuse.

Same thing

Union types without the ceremony of an algebraic data type (or Scala's not-quite-ADT equivalents) sounds pleasant.

When you can only union class types (or singleton objects), it boils down to the same thing. (Fortunately. Everything else would have been a bad move.)
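To make the "ceremony" concrete, here is a minimal Scala sketch (all names hypothetical; Scala chosen only because it is the comparison language in this thread) of wrapping two unrelated types into one closed type, which is the tagged-union encoding a direct union type would let you skip:

```scala
// Hypothetical sealed-wrapper encoding of "String or Int".
// A language with union types would let you write the union directly;
// in Scala 2 you introduce a sealed trait and one case class per branch.
sealed trait StringOrInt
case class S(value: String) extends StringOrInt
case class I(value: Int) extends StringOrInt

def describe(v: StringOrInt): String = v match {
  case S(s) => "string: " + s
  case I(n) => "int: " + n.toString
}
```

The pattern match then dispatches on the tag exactly as a switch over a union would, which is the sense in which the two designs "boil down to the same thing".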

There should be a law

There should be a law against this sort of thing.

Look at slide 6

Especially if you read a bit further and notice frustration about "Lack of support for first-class and higher-order functions". Looks like an inherent contradiction...

Not really

Did you notice the word "syntax" in Andreas' quote? (I didn't on first read). They only allow named function definitions (or parameter function defs). It's not a bad design for this kind of language IMO.

"First class" functions

Slightly tongue in cheek: If I generally don't have to name other expressions but I do have to name functions then are functions really "first class"? Seems like second class treatment to me.

You don't have to name every

You don't have to name every function -- only function literals (or you could say the problem is that they don't have function literals). Currying and partial application work as expected. I think that's first class enough to let them put the phrase in their marketing material.
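A small Scala sketch of the distinction under discussion (not Ceylon syntax; names are hypothetical): a named definition versus an anonymous literal, with partial application still available on the named form.

```scala
// Named function definition vs. anonymous function literal.
def addNamed(x: Int, y: Int): Int = x + y   // must be named
val addAnon = (x: Int, y: Int) => x + y     // anonymous literal

// Partial application works on the named definition, yielding a
// first-class value without ever writing a literal.
val inc: Int => Int = addNamed(1, _: Int)
```

If only the literal form were dropped, everything above except `addAnon` would survive, which is roughly the design being attributed to Ceylon here.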

Yes I did!

and I also noticed that if you replace "perform" with "lambda" on slide 29, nothing changes in the syntax w.r.t. the lambda calculus, and neither does Ceylon's semantics (I silently assume that this new "lambda" method will have the scope of "repeat" and all local bindings will be statically captured in the closure).

I will not even start speaking about references to Java's syntax being rooted in standard mathematical notation...

Semantics behind slide 29

If I understood correctly, "perform" isn't a keyword. The syntax on slide 29 is sugar for what's on slide 35 ("repeat" takes two parameters: "count" and "perform"). So this isn't just lambda calculus with a different keyword.

I also didn't know quite what to make of the claim of "standard mathematical notation". My guess was that they're talking about currying. Mathematicians may often write "cos x" instead of "cos(x)", but not many will use "f x y" over "f(x,y)". But I agree this was a rather curious claim. Don't misunderstand me. I found the quote Andreas mentioned and some of their other claims a little puzzling but not necessarily contradictory.
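For what it's worth, the "f x y" style alluded to above corresponds to currying, which can be sketched in Scala (hypothetical names; Scala used only as the thread's common reference point):

```scala
// A curried definition: applying the first argument list yields a
// function awaiting the second -- the "f x y" reading of application.
def f(x: Int)(y: Int): Int = x + y

// Partial application; the expected type triggers eta-expansion.
val addTwo: Int => Int = f(2)
```

This is exactly the notation the slide contrasts with "f(x, y)", so the mathematical-notation claim is presumably aimed at uncurried application.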

It isn't a keyword,

but what does this change? I think any identifier would do ... it's probably the type system (and I sincerely hope it's not the declaration of "perform" in "repeat") that makes the second parameter be treated like a lambda.

Now that I'm thinking about it, you can even get rid of the lambda keyword entirely in any typed lambda calculus (as long as the type of the body does not coincide with the expected function type, which is unlikely). OK, you still need to do binding for non-thunk lambdas, so arguments have to be named. However, even then one can adopt the /bin/sh and m4 convention of naming them $1, $2, etc., or default to type names if they are distinct, etc.

However, calling lambda "\", anything else or not even reifying it at all, doesn't change the fact that it still is lambda calculus.

About the claims: it could be that the target milieu is just so foreign to the idea of the formal approach to computer science that they have to be blindfolded first, then lured by hypes like first-class, only to discover in due time that they have merely learned a half of Scheme.

Still don't follow

The way I understand it, it's not the type system that decides how to treat it. It's the syntax. The type system will just check that the types match. For example:

int foo(int f(int));  
int bar(int n);

foo(5); // type error parameter f: expected function, found int

bar()  // type error parameter n: expected int, found function
  n(int x) {
     return x;
  };

bar(5); // ok

foo()   // ok
  f(int x) {
     return x;
  };

foo()   // error: foo takes no parameter named perform
  void perform() {
  };

So the syntax differs somewhat from that of lambda calculus. You can only define function literals as part of a declaration of either a new symbol or a parameter.

Yes, that's what I first thought

but then I realised that it wouldn't make any sense for repeat(3, hello); (slide 26) to be much different from repeat(3) hello; or repeat(3) hello() {...};. "perform" will be the name bound to hello() inside the "repeat" body, and it just coincides with the name chosen on slide 29. And I suppose you could rephrase slide 35 as repeat { times = 3; perform = void hello() {...}};. I simply can't imagine that this language makes a distinction here. For what purpose?

IMO it works like this:


repeat(3, hello);  // fine, perform = hello (positional parameter)
repeat(3, perform = hello); // fine

repeat(3)         // Is this syntax legal?  Maybe.
  perform = hello;   

repeat(3) hello; // Error: bad syntax

repeat(3)         // Error, repeat has no parameter named hello
  void hello() {...};

[Edit: In other words, I speculate that you can only define parameters by keyword outside of the parens, not positionally]

Why?

I didn't answer your question as to the purpose -- I think it's to support a style of callbacks like this:

createWindow(1024, 768, red)
  void onResize() { ... }
  void onClick(int x, int y) { ... }
;

That we're naming parameters and not passing them positionally is important as you get more parameters. The only problem I see with this idea is that presumably this works with expressions:

"Hello, " repeat(1) void perform(){return "world";} "!"

That's a little crazy looking to me.

Well if it's not positional

then I wonder how many "mathematicians, developers and software engineers" would consider this language totally bogus, defeating the purpose of replacing Java (it would then be even farther from Java than, say, Haskell).
Making named-argument applications (with {}), positional applications (with ()), and curried applications (Cylons: does it also work with {}?) behave differently is going to confuse a whole lotta people.
On our side, we can only speculate for now, but do you agree that even in this weird language repeat { times = 3; perform = void lambda (...) {...}}; would be perfectly possible?

Clearly I'm speculating

Clearly I'm speculating (I've just seen the slides with no audio), but no, I don't agree that your example string is valid syntax. What I think would be possible is:

repeat {
  times = 3;
  void perform(...) {...};
}

although I didn't see

although I didn't see anything you wouldn't have in Scala already. So I'm not sure why anybody would want it.

Even if this is true (and I'm not sure it is; Ceylon has many little features), Scala and Ceylon are targeted at different programmers. Ceylon is designed to be small and easy to understand. Scala has a complex type system and other features that are not easy to understand for the commercial-application programmers Ceylon is designed for.

Not sure I buy this argument

The average C++ or C# programmer probably doesn't understand the complexities of partial template specialization or type parameter variance. Nor do they need to.

When it comes to building expressive and easy-to-use libraries, however, it is important that expert programmers have access to these tools to build better libraries. The Scala team make this same case when showing how they used higher-kinded generics to implement their collections library.

The challenge in designing a language with that kind of depth (I hesitate to call it complexity) is to ensure that programmers face a gradual learning curve, and can be productive without first learning all the details.

Alternatively, you can just make your language easy to learn/understand by making it shallow. In doing so, however, you run the risk of creating a language in which there can be no experts. How can I become an expert in a system with no depth?

As an aside, the Ceylon project invites comparisons to Scala by casting aspersions on a "lambda calculus" straw man. I do worry, however, that these presentations show no particular signs of having studied or understood the actual features, strengths, and weaknesses of Scala.

For example, the slides about union-based ADTs tout that adding a new case will make all your existing switch statements fail to compile, forcing a programmer to go back and make non-modular edits throughout their codebase (and that's a good thing?). Meanwhile, Zenger and Odersky published "Independently Extensible Solutions to the Expression Problem" back in 2005.

Yes, it's a good thing

For example, the slides about union-based ADTs tout that adding a new case will make all your existing switch statements fail to compile, forcing a programmer to go back and make non-modular edits throughout their codebase (and that's a good thing?).

If you directly edit a closed ADT to add new cases, then static detection of places where you're not handling the new cases is a good thing. I'm sure it works the same way in Scala. You're raising a separate question of what mechanisms they have to create open ADTs.
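The Scala behavior assumed here can be sketched as follows (hypothetical names; a minimal example, not Ceylon's design):

```scala
// A closed ADT as a sealed trait: all cases must live in this file,
// so the compiler knows the full set of alternatives.
sealed trait Shape
case class Circle(r: Double) extends Shape
case class Square(side: Double) extends Shape
// Adding `case class Triangle(...) extends Shape` here would make the
// match below non-exhaustive, and the compiler would warn about it --
// the static detection the parent comment is describing.

def area(s: Shape): Double = s match {
  case Circle(r)    => math.Pi * r * r
  case Square(side) => side * side
}
```

So yes, Scala flags the same situation; the remaining question is, as noted, what either language offers for deliberately open ADTs.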

I see your point

In that case they are just touting that their language isn't broken by design - something that shouldn't need to be said, but often does. Wasn't an inexhaustive "switch" in a C program once responsible for a patient receiving a lethal dose of radiation? Yes, exhaustiveness matters.

The issue is that the slides posit the situation as "suppose you later want to add another case?", which to me sounded like a maintenance issue that might come up after your library is deployed. In that case the genie is out of the bottle, and you can't possibly make a closed-world assumption and track down every pattern match on the type.

If you need to add a case to your "closed" ADT, then it might be a sign that it isn't really closed, and you should be using a more appropriate tool. Given that Ceylon purports to be for large-scale business software, tools that rely on closed-world assumptions seem to be of dubious value.

I don't program in that world, though, so my conjectures are of similarly dubious value.

Still, I suspect that between work like "Independently Extensible..." and "Modular, Statically Typed Multimethods" we have the building blocks needed to address exhaustiveness for open data types.

Sometimes it's what you want.

I think that yes, in many cases, you'd want some way to preserve backwards compatibility with clients using the old algebraic type. But there are plenty of cases where you want clients to break (especially if your type is not part of a public API).

For example, let's say you enhance your database query language with a new type of expression. If the AST is an algebraic type then often you just want to add in the new option and have the compiler tell you where you need to fix things.

Also, the fact that they're touting this is probably because most of the target audience is familiar with Java enums, whose uses aren't checked for exhaustiveness.

Actually, there are function expressions ...

... just not anonymous function expressions. That's a relatively trivial deficiency, though one might argue that it's bad pedagogically, because it leads you away from the realization that functions are values: f(x) = x * x is just as much a value as 2.
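The point in code, as a trivial Scala sketch: a function can be bound to a name and passed around exactly the way 2 can.

```scala
// A function literal is a value, just like the literal 2.
val two: Int = 2
val square: Int => Int = x => x * x

// Both can be stored, passed, and applied like any other value.
val result = square(two)
```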

"as much a value" is debatable

Mathematically speaking, OK, but the programming languages I know can compare 2 with 3, yet you can't compare functions.

So are functions "as much a value" as 2? In theory yes, in practice no.

Equality

A computable definition of equality isn't necessary to make something a value. Are infinite lists (streams, iterators, whatever) not values?

Isn't everything non-reducible but representable always a value?

Aren't streams etc., by the classic account, not strictly speaking values, since they are infinitely reducible? (Let's say they are co-values ;-) They can nevertheless in principle be compared for equality via supercompilation of their "extract" and "duplicate" transformations. See the Equivalence paper from 2009 for some hints.
This is besides the obvious: the overly restricted and much less useful identification of functions as values (function pointers, instruction opcodes, etc.) in any modular compiled language; comparing them makes as much sense as comparing floating-point results for equality (which doesn't make them non-values).

I don't see how defining an

I don't see how defining an equality function for every type is a problem. In fact, it makes the language more uniform, even if the equality function is not necessarily semantically useful to the program's logic. E.g., Virgil does have first-class functions in the form of delegates, and two delegates are equal iff their bound objects are equal, their bound type arguments are equal, and they refer to the same method implementation. (A natural extension for anonymous functions is that they are equal if all bound values are equal and the syntactic definition points are equal.)

I think it's entirely natural to think of every data type as an algebraic data type that has (at least) an equality operator.

If your mental notion of function equality requires that x == y => f(x) == f(y), then you are already on the wrong track if f has side effects, and even if not, you end up with something not computable.

As for floating point equality, yes, equality often makes sense for exactly representable numbers like 0, 1, -1, etc.
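A quick Scala illustration of why languages fall back to reference equality for functions (a sketch; the behavior shown is Scala's, not a general claim about every language):

```scala
// Two behaviorally identical function literals: since observational
// equivalence is undecidable, Scala compares them by reference, so
// they are not equal despite computing the same thing.
val f: Int => Int = x => x + 1
val g: Int => Int = x => x + 1
```

`f == g` is false while `f(1) == g(1)` is true for every argument you try, which is exactly the gap between decidable identity and Leibniz-style observational equality.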

Leibniz

Depends on what you expect from your equality operator. I expect it to check equality. And according to Leibniz's law, that essentially means observational equivalence. That is not decidable for functions, so the language shouldn't pretend so (but it is for floats, actually).

(Your comment about side effects is a red herring, observational equivalence is perfectly well-defined for functions with side effects, too.)

Equality in Kernel is

FWIW, equality in Kernel is an equivalence relation that implies operational equivalence. That leaves room for it to be decidable. (And uniformity is the point — Kernel treats testing via equality as a right of first-class objects.)

Not only that

I think it also has a (probably unintended) consequence of allowing recursive function expressions: something you can't have with real anonymous functions without an explicit fix in the caller.
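In Scala terms (a sketch; the `fix` helper below is hypothetical, written here for illustration):

```scala
// A named definition can recurse directly by its own name.
def factNamed(n: Int): Int = if (n <= 1) 1 else n * factNamed(n - 1)

// An anonymous literal cannot refer to itself, so recursion needs an
// explicit fixed-point combinator supplied by the caller.
def fix[A, B](f: (A => B) => (A => B)): A => B = a => f(fix(f))(a)

val factAnon: Int => Int =
  fix[Int, Int](rec => n => if (n <= 1) 1 else n * rec(n - 1))
```

So a named-definitions-only design gets recursion "for free", which is the unintended consequence being pointed out.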

Do as I say, not as I do

Type system slide 8: "The temptation to overload a common symbol like + to mean something that has nothing to do with addition is overwhelming for most library authors"

Keynote slide 24: Overloading + to mean something that has nothing to do with addition (sequence concatenation).

So sequences are numeric types?

Agreed

In D the operator to concatenate lists and strings is '~', which I think is a good idea.

In Ceylon, they will have an issue with concatenation: a space is used to concatenate strings, which is nice from a readability POV, but I don't think they can use this 'space' operator for generic concatenation.

Fortress found readability issues.

The Fortress language used to use juxtaposition for string concatenation, but they found it led to readability issues:

"Hi, " name ", the " job ", how are you?"

When scanning over that line quickly, it's hard to tell which tokens are part of string literals and which parts aren't.

That said, syntax highlighting should make this problem go away, but maybe people aren't quite ready to take syntax highlighting as a 100% given?
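For contrast, Scala's string interpolation marks the holes explicitly, which avoids the token-scanning ambiguity described above (values here are hypothetical):

```scala
// Explicit $-markers make it obvious which tokens are literal text
// and which are interpolated expressions.
val name = "Pat"
val job = "plumber"
val greeting = s"Hi, $name, the $job, how are you?"
```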

Optimizing for the wrong thing

Having a pithy syntax for string interpolation might be a big deal for a scripting language, but if I'm programming a more serious application it isn't going to see much use.

I'd rather see a well thought out design for localization/internationalization of text, which will almost certainly involve optimizing for different criteria.

Hey, just for kicks here on

Hey, just for kicks here on LTU, I'll throw out the notion that variant types and cases over their values in some language might well provide only a *programmer-exercisable mechanism* for writing safe(r) software (by exhaustive case analysis), rather than having language compilers guarantee the exhaustive safeness of such 'case'- or 'match'-style constructs.

I for one am much enamored of constructs such as "head some-list" or "let (x::xs) = some-list in ..." or even "some x = dict.(key)".

Expressions, and thus programs, that "can't crash" are of no particular interest to me, nor are language designs that support this goal. Yeah, you can probably guess I don't write software for the space shuttle or high-volume automated trading systems :-)

Just trying to stretch out the valid language design space a tiny bit.

- S.