Eleven Theses on Clojure

Tim Bray shares his conclusions about Clojure.

To get your juices going: He claims "It’s the Best Lisp Ever"! (And that tail call optimization is a red herring.)

I have been contemplating a Clojure department for a while now, but heck, we may be too far behind the curve by now...


My first small taste...

did indeed have Clojure slotted as Best Lisp Ever. Lisp+JVM+(persistent collections)+STM+Actors-(1950s API design) is a rather powerful set of arguments. The only problematic omission I found was letrec, and that's only a matter of implementation effort.

letfn

Clojure does have letfn, which is letrec for functions only, arguably the most useful case.
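A rough sketch of letfn with mutually recursive local functions (the names are just illustrative):

(letfn [(my-even? [n] (if (zero? n) true  (my-odd?  (dec n))))
        (my-odd?  [n] (if (zero? n) false (my-even? (dec n))))]
  (my-even? 10))
;; => true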

The curve

Yeah, I do think we're too far behind the curve (but better late than never, probably). That said, I think the fact that we haven't discussed Clojure too much hints at an interesting aspect of its design. In terms of PL ideas, there isn't a whole lot new in Clojure, at least as far as I've seen. And I do not mean that as an insult.

LtU tends to focus on research and what we might call PL pragmatics, and so when we discuss new languages, it's usually in that context. We highly value originality, at least in our discussions here. What Clojure shows, to me, is that there are niches for languages that study history, take a bunch of ideas that are relatively well-understood and put them together in a really coherent, consistent and well-implemented way.

(This is sort of the point I was making in the Go thread, in response to "Why another language?" Comparing Clojure and Go in this light may be instructive...)

It's also very interesting to compare Scala and Clojure. It seems that they attack the same problems in very different ways. Scala has many innovative and interesting new ideas (and has gotten a lot of attention on LtU), but it sometimes feels messy and cobbled together, and I'm sometimes unsure if all of the new ideas will really prove their value. Languages can be ruined by the designer adding even one unproven idea that turns out badly. (In Clojure's case, I'd say the jury is still out on STM, but I haven't been paying close attention.)

So, classicism has its place. Java itself had a strongly classical feel, and like it or not, it has certainly proved its value.

Clojure is at least as innovative as Scala

STM and various other concurrency tools built-in, a standard library based on persistent data structures, a new macro system (really a new way to handle gensyms).

Programming language research is currently (and unfortunately in my opinion) focused on statically typed languages and static type systems in particular. This is easier because you can prove various things about your type system instead of establishing its usefulness. But Clojure doesn't have a static type system...

Genuine innovation

The difference seems to be that Clojure did not invent these features, but picked them up from various existing languages and literature, combining them in a neat way. Scala on the other hand features some genuine innovations in its object and type system (besides an interesting combination of features invented elsewhere).

Not to say that either approach necessarily results in a better language, but it explains why Scala may be more interesting from a research perspective.

Recombination is Innovation

To say Clojure was not genuinely innovative on the basis that it combines existing features in a neat way strikes me as equivalent to saying: "Shakespeare's plays were not innovative; he did not invent any new letters, but picked them up from existing literature and combined them in a neat way." or "Mozart's music was not innovative; he did not invent any new instruments, but researched the existing ones and combined their noises in a neat way."

IMO, recombination is no less a form of genuine innovation for having failed to include new primitives.

In Language Design, there is a great deal of innovation in deciding which subset of primitives to support and, for layered languages, how those primitives are stratified (i.e. higher layers can call lower layers but not vice versa), while aiming for certain emergent interactions.

In any case, Clojure did 'invent' a few features - notably, its commutative 'agents' and their interaction with STM is (AFAICT) unique to Clojure. And, as Z-Bo mentions, STM is influencing the major Clojure libraries in some innovative manners.
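One well-known aspect of that interaction, as a rough sketch (the names are hypothetical): sends to an agent made inside a transaction are held until the transaction commits, so the agent action runs exactly once even if the transaction retries.

(def account   (ref 100))
(def audit-log (agent []))

(defn withdraw! [amount]
  (dosync
    (alter account - amount)
    ;; held until the surrounding transaction commits, so a retried
    ;; transaction does not produce duplicate log entries
    (send audit-log conj {:withdrew amount :balance @account})))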

In Language Design, there is

In Language Design, there is a great deal of innovation in deciding which subset of primitives to support and, for layered languages, how those primitives are stratified (i.e. higher layers can call lower layers but not vice versa), while aiming for certain emergent interactions.


I agree with the sentiment, and I think that's a very nice way of putting it.

Languages vs. features

Regarding using new (innovative) or old (tried and tested) features, I would just like to add a quote from "Hints on Programming Language Design" by C. A. R. Hoare that I like.

"The language designer should be familiar with many alternative features designed by others, and should have excellent judgment in choosing the best, and rejecting any that are mutually inconsistent. He must be capable of reconciling, by good engineering design, any remaining minor inconsistencies or overlaps between separately designed features."

I'm not sure that I agree with it fully, but I do think that an approach like Clojure's has a higher chance of succeeding in producing a clean language.

Innovation

I'd actually rate Clojure's interaction between STM and its agent framework as innovative. Very lovely effects are possible there, with a very clean programming model for imperative reactive control flow. I don't know of any other prior art on that.

Otherwise, I'd agree that Clojure's great gifts are in quality and
simplicity, rather than innovation per se.

--Dave Griffith

What are some of the things

What are some of the things in Scala that are new?

Programming in Scala is much more like programming in conventional languages than programming in Clojure is. The parts may not be new, but the combination certainly is.

New in Scala

Extractors, allowing pattern matching over arbitrary objects and higher-order pattern matching. Unification of active and reactive actors. In-language XML literals and patterns. Combination of self-types, mixins, and abstract types to allow type-safe dependency injection and module composition. For-comprehensions implemented over arbitrary monads.

That's all I can think of. Some of these have some academic prior art, but not really much in mainstream languages.

Extractors are a variation

Extractors are a variation of Views (by Wadler), just like Clojure's agents are a variation of actors. Unification of active and reactive actors: this is not a language innovation, it's an implementation innovation for a constrained environment (like recur in Clojure); if you have coroutines you don't need it. Clojure has XML literals, just with a different syntax. Combination of self-types, mixins, and abstract types to allow type-safe dependency injection and module composition: as you say, it's a combination of old features. For-comprehensions are a macro expanding to do-notation, something Haskell users have speculated about for a long time.

If extractors are new but Clojure-style STM is not, then one isn't applying the same standard to both languages.

I have a feeling that trying

I have a feeling that trying to give marks for innovation is going to lead in an unproductive direction... Simply enumerating the innovative ideas, however, will surely interest many.

Of course people debate

Of course people debate novelty all the time. That's why it's good to have peer-reviewed journals and conferences to give some sort of objective assessment. Scala's approach to component abstraction via self types and abstract types appeared in OOPSLA 2004 and Scala's extractors appeared in ECOOP 2006. So I am pretty confident they are indeed new.

Scala's extractors share only a superficial resemblance with Wadler's views, but they do have a lot in common with F#'s active patterns. In fact the two were developed concurrently and collaboratively. Leaving these points aside, the main problem Scala tries to tackle is the unification of object-oriented and functional programming in a statically typed language. I don't think there's another mainstream language that goes further than Scala in this.

I do realize that there is quite a bit of Scala bashing going on, typically from people who are firmly on one side of the OO/FP debate and who don't want to hear about a unification, or else from people who strongly prefer dynamic over static type systems. I take this as a sign that we're successful.

let (|Even|Odd|) n =


let (|Even|Odd|) n =
    if n % 2 = 0 then
        Even(n/2)
    else
        Odd((n-1)/2)

object Even {                              
  def apply(x: Int): Int = x * 2
  def unapply(z: Int): Option[Int] = if (z%2 == 0) Some(z/2) else None
}
object Odd {                              
  def apply(x: Int): Int = x * 2 + 1
  def unapply(z: Int): Option[Int] = if (z%2 == 1) Some((z-1)/2) else None
}

view int ::= Even int | Odd int
  in n = Even(n/2) if n%2 = 0
       = Odd((n-1)/2) if n%2 = 1
  out Even(k) = k*2
      Odd(k) = k*2+1

Extractors & active patterns are certainly not exactly like views. But the point I'm trying to make is that Clojure isn't exactly a collection of features from existing languages and the literature either. Some examples:

STM: Clojure takes MVCC, which comes from databases, and applies it to STM.

Seqs: lazy streams instead of iterators everywhere.

Persistent data structures: The idea of persistent data structures isn't new, but the particular data structures in Clojure are. Vectors and maps in particular. They are functional versions of imperative data structures from the literature (e.g. http://lamp.epfl.ch/papers/idealhashtrees.pdf).

In syntax-quoted forms, symbols ending in # are converted to unique fresh names. This removes a lot of manual gensym calls from macros.

There is a new multimethod system (not like CLOS): http://clojure.org/multimethods (small sketches of some of these points below).
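A few rough Clojure sketches (names are illustrative):

;; Persistent vectors: conj returns a new vector that shares structure.
(def v  [1 2 3])
(def v2 (conj v 4))          ; v is still [1 2 3]

;; Seqs: lazy, so this terminates despite the infinite range.
(take 5 (map inc (range)))   ; => (1 2 3 4 5)

;; Auto-gensym: t# expands to a fresh symbol inside the syntax-quote.
(defmacro my-when [test & body]
  `(let [t# ~test]
     (if t# (do ~@body) nil)))

;; Multimethods: dispatch on an arbitrary function of the arguments.
(defmulti  area :shape)
(defmethod area :circle [{:keys [r]}]   (* Math/PI r r))
(defmethod area :rect   [{:keys [w h]}] (* w h))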

Javascript had E4X

Javascript had XML literals, in the form of E4X, before Clojure was even a twinkle in Rich Hickey's eye.

Their pluggable compiler

Their pluggable compiler allowed them to add a type-directed selective CPS transform to support delimited continuations and efficient blocking operations. That was pretty novel IMO. See their other papers for more examples.

"Innovative"

Sorry, but I don't agree. As far as I know, all those things are mostly best-of-breed implementations of well-understood ideas. STM is the one thing that I would probably concede, although as far as I know Clojure is not attempting to advance the state of knowledge regarding STM.

It really seems to me that Clojure's focus is: take ideas that are proven and solid, but not yet mainstream (and not yet in the same language), implement them really well, and put them together in a really consistent way.

Again, I really don't mean to suggest a value judgment here. I really have great admiration for Clojure, and may eventually come to prefer it over Scala.

(Your comments about PL research are, in my view, something of a separate discussion. I'm definitely sympathetic to your view, although I do like type theory as well. But I don't think there's a whole lot of research in Clojure by any definition. But maybe I'm wrong.)

although as far as I know

although as far as I know Clojure is not attempting to advance the state of knowledge regarding STM.

This depends on your perspective.

If you consider using STM in a graphics application, then Clojure is probably breaking new ground -- and Rich Hickey has already stated his thoughts on best practices for using STM in graphical applications. i.e., what works well, what ways to augment the default behavior of STM in Clojure, etc. to avoid frame rate problems.

Clojure's implementation of

Clojure's implementation of STM allows for nested transactions on one hand. On the other, it therefore doesn't allow for orElse and cousins. This is an interesting design choice because it pushes people towards necessarily high-level use of STM for transactionality. The plans to integrate it with other transaction systems on the JVM look very promising for a certain sort of macro-level logical structuring of certain types of applications. The STM is also interesting because it should be easily extensible/configurable for different sorts of schemes regarding which threads to retry.

On the other hand, it doesn't provide a progress guarantee as far as I recall, so the transactional model is not perfect -- i.e. it is possible to code up a denotationally sound model that fails. A standard example would be two variables -- a and b -- that should never be equal, but due to multiple inconsistent transactions, may briefly be so. If those two variables are equal, the transaction loops. Since it loops, it never commits, never detects that it read a and b in an inconsistent state, and never manages to retry and make progress.

I'm not sure whether the almost-but-not-quites of Clojure's STM will arise in real-world use, or whether the abstraction leak won't matter, but it nonetheless exists.

The scenario you describe

The scenario you describe cannot happen in Clojure's STM. Clojure's STM uses MVCC and provides snapshot isolation to transactions - a transaction will never see inconsistent state. Being MVCC-based, it is subject to write skew, and provides an ensure construct that allows one to tag dependent reads, with more concurrency than the dummy writes usually used to address write skew. Livelock is a theoretical scenario, but it is not a non-blocking design, thus mitigating churn. Actual livelock is avoided via retry limits and timeouts. There is, as you mention, as-yet-unrealized potential for configurable contention resolution policy.
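A rough sketch of write skew and ensure (the refs and the invariant are hypothetical): two refs whose sum must never go negative, with the ref that is only read tagged via ensure so that concurrent writers to it conflict with this transaction.

(def a (ref 50))
(def b (ref 50))

;; invariant: (+ @a @b) must stay >= 0
(defn withdraw-from-a [amount]
  (dosync
    (ensure b)   ; dependent read; allows more concurrency than (ref-set b @b)
    (when (>= (- (+ @a @b) amount) 0)
      (alter a - amount))))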

Thanks for the reply! I

Thanks for the reply! I should have realized that about MVCC earlier -- the tradeoffs are quite different from other STM mechanisms.

Implementation vs. Model

RE: "it doesn't provide a progress guarantee as far as I recall, so the transactional model is not perfect [...]"

The situation you name is an implementation issue.

It could be fixed by alternative implementation strategies (for example: MVCC, contention-detection on read/write, or contention-detection by a watchdog thread) without violating Clojure's transaction model, and without breaking existing code. To guarantee progress in the face of livelock, one can use contention managers, introduce pessimism on heavily contested resources, or interact with the scheduler to forcibly serialize transactions that conflict.

In this, STM is a bit like GC - implementors have much freedom to play or add intelligence to the implementation without violating the semantics.

As per Rich's reply above, I

As per Rich's reply above, I was wrong about what the issue is, which is not inconsistent views, but rather MVCC and write-skew (and of course livelock).

However, his point about the introduction therefore of the "ensure" construct shows that unlike gc, which we can treat as not changing semantics, different STM implementations with different isolation levels (livelock policies aside) actually do mean different semantics.

Unless you have complete serialization (and a typed guarantee of such) there's always going to be some abstraction leak. The interesting question is how much, if any, such leak gets in the way -- i.e., when "good enough" is really good enough.

Isolation Levels

The use of 'ensure' is essentially a programmer annotation to allow some extra concurrency at the cost of potential loss of isolation (depending on how it is used). With the GC comparison, I would relate it to 'weaken' references - a programmer annotation allowing the GC to collect an object earlier under certain circumstances even if it is still referenced. One could also relate it to a 'memoize' annotation for pure functions.

For these cases, an implementation that simply fails to acknowledge the annotation would still be 'correct'... just not optimal.

Many different STM implementations - not just MVCC - are perfectly capable of taking advantage of varying isolation levels. Indeed, historically many (maybe even most) transaction systems do allow programmers to trade a bit of isolation for some extra concurrency.

Dirty reads, done dirt cheap. The ability to weaken transaction-based isolation on reads is especially useful when searching a mutable construct that obeys some structural rules (e.g. a sorted list) prior to making a tweak. One can use the structural rules to eliminate conflicts, thus reducing the need for transaction-driven isolation.

I would not call this isolation tweak an abstraction leak, nor would I put responsibility on the particular STM implementation (which would suggest a change in implementation would require changing Clojure code). Rather, one might place responsibility for it on Clojure's goal of allowing programmers to achieve performance goals from within Clojure.

The use of 'ensure'

The use of 'ensure' is essentially a programmer annotation to allow some extra concurrency at the cost of potential loss of isolation (depending on how it is used).

I'm no STM guru, but I'm pretty sure you have this wrong. The use of MVCC allows for extra concurrency, and the use of "ensure" restricts this extra concurrency. Programs using a different model of STM with stronger isolation properties can be rewritten using Clojure's STM, although you would need to take great care to "ensure" the correct reads.

Oops.

The use of MVCC allows for extra concurrency, and the use of "ensure" restricts this extra concurrency.

To say MVCC is offering greater concurrency would be an error in terminology. What MVCC offers is a great deal of parallelism while still achieving isolation semantics. (Recall: concurrency is a semantic property, parallelism is an implementation property.)

There are, of course, plenty of transaction semantics that do weaken isolation and allow certain races in order to allow concurrency between transactions. (Whether this is an 'improvement' is arguable...)

Therefore, when I read Rich's statement that 'ensure' allows for greater concurrency, I assumed that it must weaken isolation. One cannot reconcile concurrency (a semantic property) with isolation semantics, after all.

This assumption seems to be in error: (ensure x) is basically a read with the extra annotation "I'll probably be writing x later, so keep others away from it." This restricts parallelism while aiming to avoid unnecessary conflicts, and it significantly improves parallelism relative to a dummy read-write (ref-set x @x), which had apparently become an idiom among Clojure programmers aiming for a similar effect.

Rich was incorrect to use the word 'concurrency' to describe what was improved (because the transactions are gaining no concurrency semantics) but I failed also: I should have looked up in greater detail what Rich was attempting to explain.

Matt, It would be good to

Matt,

It would be good to see some specifics of the things you don't like in Scala. I know of some things I would like to improve, but I am not sure we are talking about the same things.

I come to praise, not to bury

It would be good to see some specifics of the things you don't like in Scala.

I'm not answering for Matt, but I understand and share his overall sentiment, and I think Clojure is a good counterpoint to understand the "criticism".

I like many, many things about Scala (and to be honest I'm more likely to use it in production than Clojure, since I like the power and expressiveness of static type systems as a design and communication tool). But I sometimes think it has too many features and they don't necessarily hang together as a cohesive whole.

I've spent much less time learning about Clojure, but from the get-go Clojure has a stronger authorial voice: there is a sense that the language has an opinion about what a program in it is supposed to look like. I can feel a driving force behind it keeping it focused and clean.

Scala feels more experimental and permissive. This isn't a bad thing from my inner PL enthusiast's point of view, but my inner software developer longs for the more rigid boundaries of an authorial design space.

I'm reminded of the dichotomy between Perl and Python: TMTOWTDI vs. "one obvious way to do it".

Whether there is really any need to "fix" this in Scala is bound to be a matter of taste and inclination. Even as it is, I greatly appreciate the contributions it has made and can continue to make to the PL world.

For me, the difference is more about whether I end up being a genuine Scala practitioner or not.

Scala is very complicated

Scala is very complicated and its feature set is not orthogonal. For example:

* Abstract type members and type parameters
* Pattern matching and method dispatch
* Mixin composition and normal inheritance

I think a better language is possible that unifies some of these things. That said, Scala is very good already.

Does anyone else feel uneasy about implicit parameters? I don't like the way program execution becomes dependent on types.

Responding to: Scala is very complicated

I don't want to feed the flames, but I just can't let this go unanswered. You write

Scala is very complicated and its feature set is not orthogonal. For example

* Abstract type members and type parameters

That's downright funny. What's your definition of "orthogonal"? I believe we can all agree there are two fundamental methods of abstraction: parameterization and abstract members. Now, compare with Java or a number of other mainstream languages: You can parameterize with non-function values but not with functions. On the other hand, you can have abstract methods, but no method parameters. For types, it's again parameterization but no member abstraction. In Scala you can use either abstraction mechanism with every sort of term or type. How is that not orthogonal?

* Pattern matching and method dispatch

By the same token you should damn Haskell and ML for having both pattern matching and higher-order functions.


* Mixin composition and normal inheritance

They are fundamentally the same in Scala; classes are special cases of traits which are translated into Java classes. There are no two forms of inheritance; everything is explained in terms of mixin composition.

Does anyone else feel uneasy about implicit parameters? I don't like the way program execution becomes dependent on types.

Then by the same token you need to reject Haskell's type classes. Maybe you do, that's OK.

I guess I let myself already be dragged too far into this debate. Obviously people have different tastes and discussing those is futile.

Generally, I am not happy about how Scala and Clojure are often pitted against each other. There's no reason for this. They are both fine languages. Scala occupies a space between Haskell and Java in roughly the same sense as Clojure sits between Scheme and Java. That is, they are both more pragmatic and less pure than their more academic siblings, and each also has something new to offer. As such they can learn a lot from each other.

On the other hand, in my twenty years of involvement with the FP community I have never known someone who converted from Scheme to Haskell or the other way. And the FP community was and is totally OK with that. So let's continue to respect each other, even if we branch out to new platforms. We all have a lot of work to do convincing Java programmers to take a bet on FP. This will be hard enough, even without the distractions of infighting.

I converted

... from Scheme to Haskell.

And I did so just to ruin your "In all my years I've never!" argument. ^_^

I did the first third of my education in Java, second third in Scheme, last third in Haskell (plus a taste of Obj). I would have liked to get some experience with Coq.

Clojure and Scala are pitted against one another because they are competing for mind-share, especially in the parallelism space. Scala has its Actors model, and Swarm... Clojure has its STM.

Unlike Scheme and Haskell, Clojure and Scala don't offer the impression of being 'lab-toy academic languages'. That isn't to say you can't do serious stuff in Haskell or Scheme, but there certainly is a reputation to fight.

haskell+java=?

i wish CAL had more traction. (for the record, i went from adoring Scheme, to SML, and I very much appreciate Haskell. and yet now i am mostly trying to learn Clojure and experiment with Bigloo. and my day-job is Java (owww). i do not believe all programming language users are stuck in one camp or another, and would hazard to guess that the 'best' are not so narrow minded: right tool for the job and all that (and sometimes the job is just "learning"). this is an argument in support of "why can't we all get along" but different than what i felt Dr. Odersky was saying.)

In general, who can object

In general, who can object to that? But the 'best' are often hard to find. Only recently someone who might be considered to fall into this category asked me which language I used to implement a simulation and, when I said Scheme, asked me "what's that?". Next time my answer is going to be "Scheme is like Clojure". That's what I call progress.

Clearing the air, I hope

To be very clear, I really didn't mean to pit Scala and Clojure against one another, and I'm frustrated by the way this thread has gone. I do believe that the two languages are products of different design approaches, and I do think that comparing them can be instructive for that reason. But I do not think that one is therefore necessarily better, or that there must eventually be a single winner. And I don't think I ever suggested that. If you feel that I'm attacking your work, I apologize. I like Scala and I have a lot of admiration for it. I have reservations about Scala, but it's my personality to have reservations about virtually everything. And, full disclosure: I've been using Scala in production for some years, while I haven't written more than ten lines of Clojure. But from what I've seen, I like Clojure as well.

Anyway, I've replied privately to answer your original question about my thoughts on Scala. I'm not sure LtU is the best place for it, and I don't want to give the impression that I'm "bashing" Scala.

And finally, regarding Scheme and Haskell, I don't know about conversion experiences, but Oleg seems to move quite comfortably between the two. Maybe he's an outlier, but we can all aspire...

Actually I find this thread

Actually I find this thread very interesting (even if a bit too adversarial). Only problem is that it's more about the design decisions in Scala than those in Clojure (hint, hint)... But those are just as interesting as far as I am concerned, so do carry on ;-).

STM worth being excited about

It's more about the design decisions in Scala than those in Clojure (hint, hint)...

To follow that other line...

As a long-time STM skeptic, I have to say that STM as envisioned in Clojure strikes me as exactly on track: focused on localized, stateful containers of immutable values being mutated with commutative operations.

It fits in the CTM hierarchy, it emphasizes reducing statefulness as the first line of defense against concurrency madness, but acknowledges that sometimes you need transactional state...
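For example, commute expresses exactly that commutative-update style (the counter here is just an illustrative sketch):

(def hits (ref 0))

(defn record-hit []
  ;; commute tells the STM the update is order-independent, so
  ;; concurrent transactions need not conflict on this ref
  (dosync (commute hits inc)))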

Nice stuff!

How is that not

How is that not orthogonal?

That is symmetric, not orthogonal. Orthogonal means that language features don't overlap. Parametrization and abstract members do in fact overlap. I agree that you should go for a symmetric design if you're going to provide parametrization and abstract members, but (1) are both forms worth the complexity cost? (2) is there a way to provide the best of both worlds in a simpler way?

BTW, the design currently isn't completely symmetric. For methods you only provide parametrization, not abstract-member-like abstraction (does that even make sense for methods?).

By the same token you should damn Haskell and ML for having both pattern matching and higher-order functions.

I do not. The overlap between higher order functions and pattern matching is much smaller in practice. It's clear when you should use higher order functions and when you should use pattern matching. But with dispatch+pattern matching you have to make a choice (and both choices have pros and cons):

class X { def foo() ... }
class Y { def foo() ... }
class Z { def foo() ... }

vs

def foo(a):
  case a of
     X -> ...
     Y -> ...
     Z -> ...

You might say that the first one is extensible in a different way than the second one. But if we add predicate dispatch with patterns that can act as predicates:

def foo(X a) ...
def foo(Y a) ...
def foo(Z a) ...

A complete object system is just two constructs (rough sketch):

class Foo(field1, ..., fieldN)

This defines a constructor and a way to pattern match on the objects.

def func(pattern1, ..., patternN) = body
def func(pattern1, ..., patternN) when predicate = body

Patterns would be extensible like extractors in Scala.

Now we have a very powerful, simple and unified design. You don't have to make a choice between pattern matching and extensible dispatch, we have best of both worlds. This is very expressive because the predicates themselves can be extensible methods. I don't know how to add a type system to this however. Do you have ideas?

They are fundamentally the same in Scala; classes are special cases of traits which are translated into Java classes. There are no two forms of inheritance; everything is explained in terms of mixin composition.

So why did you still choose to provide classes instead of just one construct?

Then by the same token you need to reject Haskell's type classes. Maybe you do, that's OK.

I do not reject type classes or implicit parameters, I just don't view them as a completely satisfying solution. And I'd like to know if anyone else thinks the same, and if they have an alternative.

Scala occupies a space between Haskell and Java in roughly the same sense as Clojure sits between Scheme and Java. That is, they are both more pragmatic and less pure than their more academic siblings, and each also has something new to offer. As such they can learn a lot from each other.

Exactly :) There are a lot of good ideas in both languages, and some good ideas (independently?) incorporated in both languages. A small example: array/map indexing with function call syntax. arr(i) and (arr i). You can use this in other places too: (map hashtbl xs). This is just a small simplification, but small simplifications add up.

On the other hand, in my twenty years of involvement with the FP community I have never known someone who converted from Scheme to Haskell or the other way. And the FP community was and is totally OK with that. So let's continue to respect each other, even if we branch out to new platforms. We all have a lot of work to do convincing Java programmers to take a bet on FP. This will be hard enough, even without the distractions of infighting.

Is the goal to convert Java programmers to FP? The goal should be creating better languages, and open discussion helps with that. I'm trying to raise questions and reevaluate design choices, not attack Scala.

RE: Predicate Dispatching

Now we have a very powerful, simple and unified design. You don't have to make a choice between pattern matching and extensible dispatch, we have best of both worlds. This is very expressive because the predicates themselves can be extensible methods.

The approach you just described is too powerful, too expressive.

Predicate dispatching has failed to achieve wide adoption for many reasons:

  • you cannot in general prove that two predicates do not overlap (i.e. fall into different equivalence classes)
  • where two predicates do overlap, a human must decide a precedence
  • you cannot in general prove that one predicate is a specialization of another
  • you cannot in general prove that a group of predicates cover all elements of a particular set
  • achieving performance can prove very difficult.

These make typing difficult, of course, but many aficionados of dynamic languages might (reasonably) wave that away.

More problematic in practice is what these issues mean for modularity. One cannot automatically compose two or more client libraries containing extensions for a given function, at least not in a clean and well-defined manner.

Programmers will almost certainly need to determine the precedence interleaving for the functions from the two different modules, supposing one isn't just assigning a precedence number to each function. Open-systems runtime composition (i.e. abstract factories provided by runtime plugins, distributed objects, etc.) is pretty much impossible, whereas the Object-based design can do so quite readily.

A combination of heuristic efforts by both the compiler and programmer will likely be 'good enough' in achieving an acceptable dispatch precedence. I don't want to diss predicate dispatching entirely - it's certainly nice for rules-based programming, and it is still more extensible than many designs. But to argue that this is a "simple and unified" design strikes me as optimistic.

In Logic programming that sort of dispatch can be very powerful. The problem of figuring out 'which' function-body to resolve is greatly simplified in logic programming: you simply evaluate ALL of the bodies, and allow further mismatches to filter them. The absence of side-effects saves you quite a few troubles.

A Simple Precedency Policy

I know that languages like Cecil that have predicate dispatch use implication between predicates to determine the precedence. But is this really necessary? Just take the lexical method order, with the last method defined having the highest precedence (so, the reverse order of normal pattern matching). This rule is simple, thus easy to learn, and provides the right behavior in all the cases I've seen.

For example:

def Mammal?(Animal a) = false // by default animals are not mammals
def Mammal?(Rabbit r) = true
def Mammal?(Lion l) = true

Now we can use Mammal:

def foo(Mammal m) = ...

And provide a specialized version for Rabbits:

def foo(Rabbit r) = ...

Note that we couldn't do it the other way around. Suppose that we have a library that provides foo on Rabbits and we want to make the method work on Mammals. We can define a new method:

def myfoo(Mammal r) = ...
def myfoo(Rabbit x) = foo(x)

Or if we provide a check if some value is in the domain (like with Scala's partial functions):

def myfoo(Mammal r) = ...
def myfoo(x) where x in domain of foo = foo(x)

If this is common enough we could have a new keyword that adds a method with lowest precedence, like this:

def default foo(Mammal x) = ...

Maybe it isn't necessary. In any case it should be considered bad style to add a more general case to an existing method.

Can you think of any examples where this precedence policy is wrong?

achieving performance can prove very difficult.

Possibly, but maybe not if you know all methods at compile time (or link-time or run-time if you are a JIT). What compilers that optimize dispatch do is replace method lookups in tables with series of if-statements because modern processors like that much better than unpredictable control flow. With this system you naturally have this form already. If you reorder the if-statements with run-time information on which cases occur most frequently I believe you can have very good performance. Getting good performance for this is probably easier than getting good performance in a language that allows runtime method definition (e.g. Java, Smalltalk).

Programmers will almost certainly need to determine the precedence interleaving for the functions from the two different modules, supposing one isn't just assigning a precedence number to each function.

Can you give an example where the precedence strategy I mentioned above would be wrong? In practice two separate modules will not even have overlapping predicates, no?

Open-systems runtime composition (i.e. abstract factories provided by runtime plugins, distributed objects, etc.) is pretty much impossible, whereas the Object-based design can do so quite readily.

Yes, I agree. It is a trade-off that's interesting to explore. In any case you can still send immutable messages over channels. And you can have distributed objects if you're willing to be explicit about which operations are distributed. You can think of this as an extension of functional languages ("extensible pattern matching") not as an extension of OOP dispatch. Then this isn't a lose situation because you didn't have distributed objects in your functional language ;)

Eliminating Modularity

In practice two separate modules will not even have overlapping predicates, no?

That sounds mightily optimistic.

You may wish to look into Open Functions and Datatypes

Thanks. Here's a working

Thanks. Here's a working link for other people who'd like to read that paper: http://people.cs.uu.nl/andres/OpenDatatypes.pdf

That sounds mightily optimistic.

Maybe, maybe not. In any case this system solves the (untyped) expression problem described in that paper. Anything that can be done with a class hierarchy is readily translated into predicate dispatch with this precedence rule, because in a class hierarchy (in most languages at least) the superclasses are defined before the subclasses. If you keep the methods in that order then dispatch will work fine.

class Num(n)
class Plus(a,b)

def eval(Num(n)) = n
def eval(Plus(a,b)) = eval(a) + eval(b)

Add a new data type:

class Minus(a,b)

def eval(Minus(a,b)) = eval(a) - eval(b)

Add a new operation:

def tostring(Num(n)) = tostring(n)
def tostring(Plus(a,b)) = tostring(a) ^ "+" ^ tostring(b)

The patterns in this example don't overlap so it can't possibly go wrong with any sane predicate dispatch precedence rule. Maybe you were thinking of another example?

Just wondering...

What kind of code do you write for a living that requires this kind of complexity? (I'm not making a value judgment on its complexity, only that I have never used a complex solution such as the one you describe).

Complex Domain-Object Code

What kind of code do you write for a living that requires this kind of complexity?

While I do not do it as a living, I see code of similar or greater complexity when working with Inform programs... i.e. where the 'objects' in question are elements of a simulation, and you need to tweak how each Qualified Subject Verbs Object Preposition Object. I.e. how do Rabid Chinchillas Drive Cars Over Ice? Should you have a special rule or exception for that case?

Peruse some code at the Inform site and you'll see code that gets hairy, and a language designed for handling it. I think there is a lot to learn from that language.

Right....

I expected the answer you gave as the only valid answer... a different answer would've surprised me.

Now my follow-up question... How is this better than a rule-based system?

How do you maintain and debug "complex domain-object code"? At least for the stuff mentioned in the Prolog papers (i.e., "The Practice of PROLOG") there is a big emphasis on the system to be self-describing in addition to supporting advanced dispatch.

Rules Based + Domain Objects

Now my follow-up question... How is this better than a rule-based system?

Rules-based systems are quite orthogonal to "complex domain-object code". That is, rules are a very poor mechanism for representing objects, and objects are a very poor mechanism for integrating rules (at least in a high-performance manner). The Inform language is a good example of this combination, and it is also rules-based (that is, you can express things like "whenever Event do Action" that will trigger in the future without appearing in the normal control-flow of another action).

That said, it is only useful to combine the two in a simulation or game. After all, a rules-based system over external data simply doesn't need the concept of creating "new" objects ('objects' are just a pattern-matching entity-view imposed on data for the sake of expressing complex rules). And object-oriented programs (under the Nygaard classification) are supposed to be constructed of black-box components so there should be little reason to create rules that peek inside.

Only if one is writing up a game or simulation - that is, where one is creating data (perhaps driven by input from external resources) - does it make excellent sense to combine the two. But, in those situations where it does make sense, it is a powerful combination... a fine basis, I think, for the sort of work Tim Sweeney does.

Note that I do not call working with "complex domain-object code" by the name "object-oriented programming". The two concepts are distinct methodologies (though I'm sure many at comp.object would disagree).

How do you maintain and debug "complex domain-object code"?

Unit-tests. Type analysis. (Inform is statically typed, and comes with a powerful set of test utilities.)

While not available in practice, one could (theoretically) also request certain analyses for invariants... i.e. a nice property for gaming is if the game itself knows when precisely you've lost (got stuck in a losing position), as opposed to leaving you wandering around for hours wondering what you missed.

Note that I do not call

Note that I do not call working with "complex domain-object code" by the name "object-oriented programming". The two concepts are distinct methodologies (though I'm sure many at comp.object would disagree).

I don't think anyone reads (as evidenced by posting to) comp.object regularly except for maybe Patrick May, Stefan Ram, H.S. Lahman and myself.

Taking your theory about disagreement to task:

The Inform language is a good example of this combination, and it is also rules-based (that is, you can express things like "whenever Event do Action" that will trigger in the future without appearing in the normal control-flow of another action).

Why can't I represent that using objects? It might be a poorly performant message-passing solution, but it is representable and definitely maintainable -- you have a collaboration diagram, and in "UML as a programming language"-style MDA you strictly DON'T care about considering control-flow WITH synchronization. Instead, you decouple control-flow from synchronization. Thus, we can still adhere to Nygaard's definition, by making the system modular along problem domain entities' responsibilities.

What Inform seems to offer is an excellent way to rapidly prototype this in an ad-hoc fashion. This is enough for me to know I am interested in looking at Inform, and see how well it allows rapid prototyping... as I am a huge fan of rapid prototypes.

Only if one is writing up a game or simulation - that is, where one is creating data (perhaps driven by input from external resources) - does it make excellent sense to combine the two.

Hmm... could you flesh this out a bit more for me? Why creating data? Are you talking about object graph management, or what?

Rules-based and Domain Objects

Why can't I represent that using objects? It might be a poorly performant message-passing solution, but it is representable and definitely maintainable -- you have a collaboration diagram, and in "UML as a programming language"-style MDA you strictly DON'T care about considering control-flow WITH synchronization.

Attempting to represent whenever Event do Action as an object is possible, just not practical. Don't start an argument based on whether you can merely represent something... not unless you're prepared to swim in a Turing tarpit.

The reason representing rules as objects is difficult is that 'Event' can be both arbitrary and have abstracted parameters. Support for abstracted parameters, especially, is critical. Consider: whenever an Animal wearing an electric Collar crosses a Line that is recognized by the Collar, the Animal is zapped by the Collar. The concrete instances of the animal, line, and collar involved are selected by the rule variables and pattern matching... they are not known in advance.

While you could 'represent' the Event for this rule as an object graph, it is non-trivial to do so... often one will end up compiling the trigger-expression into the object graph representing the complex trigger, at which point you're essentially writing a new language to make up for the deficiencies of OO. Further, you would need to hook the graph into the right registry, and the registry would need to handle some very arbitrary types (which in most OO languages would require massive reflection and downcasting from 'Object'). Further, you would need to ensure that all new representations of animals, collars, lines, and relationships (such as whether a collar recognizes a line) are also submitted to the same registry. Further, you would need to expose all the second-class OO observer-pattern vocabularies for each object so that you can perform the subscriptions (i.e. to get low-latency feedback when an animal crosses a line). Still further, you would need to make all this works with garbage-collection (i.e. avoiding strong cyclic references between registry, rules, subscriptions, and domain objects). This is very much an uphill battle! I think it reasonable to say that rules-based programming simply is going to be extremely rare in any language that fails to offer programmers better than OOP objects for representing the rules and domain elements.

It is useful to support first-class rules - things you can create, destroy, encapsulate, and modularize over the lifetime of a program. Inform achieves this via support for 'rulebooks'. A rulebook encapsulates a collection of rules which may be manipulated and applied to the environment dynamically. Use of objects to represent rules can, at least, offer a few of these features. But it's still an uphill battle to actually integrate those rules, so one should also have first-class support for the integration in rules-based programming.

[As an aside related to the OP: I very much dislike Clojure's use of 'when' as pretty much equivalent to an 'if'-expression. IMO (when X Y) should introduce into the environment a one-time-activation rule that will wait until X becomes true, then activate Y. Whether or not X is true, control flow semantics should immediately move past the 'when' statement.]

[Edit 2: Perhaps you were thinking about a similar concept - when Event do Action in reactive programming, where 'Event' represents a connection to a concrete signal resource. If so, then you were simply forgetting that the context of discussion was rules-based programming. Rules, of any non-trivial sort, have abstractions... i.e. the difference between "whenever Z-Bo does XYZ" vs. "whenever any LtU member does XYZ". These abstractions make for extra challenges: the lifetime of the rule is now decoupled from the lifetime of at least some of the objects to which it applies, and the performance scaling issues are much more significant, and so on.]

Instead, you decouple control-flow from synchronization.

This isn't a synchronization issue (though there are plenty of semantics issues for concurrency of rules-based programs, thus all the 'temporal step' designs and such).

It's a control-flow issue because object control flow is based upon reaction to targeted messages (even if provided through a pubsub service), whereas rules-based control-flow is based upon reacting to arbitrary abstract spatial-temporal patterns in a larger environment, and messages in that sense are just temporally discrete observations.

Complex domain-objects in Inform don't react to messages. They're simply elements in an environment.

Why creating data?

By 'creating data' I mean, for example, saying that "there is a table" or "the plate is now broken". Data either comes from an external resource and analysis thereof... or you 'create' it locally, as a sort of lie, plan, or other imaginative event.

Creating data is quite suitable to games and simulations and other imaginative pursuits, but not very suitable to interacting with 'real' systems. Working with a real system uses external data resources produced by observation or analysis thereof. To some degree you might need to make predictions where you cannot make observations, but ideally all observations are on an open loop - if there is any feedback from an action, it occurs through the future observations.

Rules-based programming is well suited to analyzing and reacting to data whether it be created or observed. But OOP objects - which, at least under the Nygaard classification, represent components of a 'machine' describing the inner workings of an application or program - do not represent 'data' of any sort. In the absence of data, rules-based programming isn't especially useful for interacting with OOP objects. (I suppose you could use rules as an orthogonal basis for aspect-oriented programming, though that approach eventually hits a security wall and fails to scale to open systems.)

In the open loop, one isn't directly creating 'new' data to which to react. Thus the explicit support for 'complex domain-objects' is largely eliminated. When working with a simulation or game or imaginative pursuit, however, that loop is at least partially closed. The programmer must be able to say (as the result of a player waving a wand or swinging a hammer, for example) that there is now a table... or that the plate is now broken.

To express the concepts of 'table' and 'plate' and similarities between them (i.e. that all coffee tables are tables, and that both are surfaces atop which items may be placed), or the destruction of a plate and the creation of plate-fragments, may utilize something superficially similar to OOP (in that there are constructors, destructors, and inheritance), yet not at all related to the Nygaard classification or the discipline or methodologies of OOP.

Thus, I distinguish the two.

"Domain objects" are not OOP objects - not parts of a virtual data-crunching message-passing machine - but rather exist as a consequence of finding useful patterns in any body of data. You can view a mass of water vapor particles above a certain altitude as a 'cloud', or a large land-locked mass of liquid water touching earth as a 'lake', or an "ugly bag of mostly water" as a 'human' - just a way of looking at things. You recognize domain objects as abstract patterns in space, and verbs as abstract patterns in time, and rules-based programming would be used to describe these patterns and associate with them appropriate actions upon observing them.

I don't code for a living.

Disclaimer: I don't code for a living. Where I see the additional power being useful is for example in compilers, data structures, mathematical software and web applications. Tidbit: Hobo plugin for Rails provides predicate dispatch.

Maybe you're right, projects that I do get paid to do are relatively trivial.

I'm curious what you find complex about this, though. Is it the predicate dispatch system or the particular animals example? Think of what I described as a bare-bones system (like Scheme with only define, set!, and lambda). In a realistic language you'd provide some syntactic sugar on top of it. For example this syntax:

class Expr = Num(n) | Plus(a,b) | Minus(a,b)

Could expand to:

class Num(n)
class Plus(a,b)
class Minus(a,b)

def Expr?(x) = false
def Expr?(Num(n)) = true
def Expr?(Plus(a,b)) = true
def Expr?(Minus(a,b)) = true

And you'd write the animals example like this:

class Mammal = Rabbit | Lion

This simplifies the example, but it complicates the system by introducing a new construct: special syntax for algebraic data types.

Not the point

Maybe you're right, projects that I do get paid to do are relatively trivial.

The spirit of my question was more in the tone of, "As an engineer, my preferred way of solving a problem is to avoid it. Show me what problems you face that you cannot avoid using this technology for." In other words, how does predicate dispatch research get funding?

I prefer simple, maintainable solutions to complex, ad-hoc ones.

In other words, how does

In other words, how does predicate dispatch research get funding?

I wouldn't know the answer to any question of the form "How does X research get funding?". But it is not very relevant for me (I'm not a computer scientist). My goal is to create a better language not get funding ;) What I tried to show here is that predicate dispatch doesn't need to be as complex as the form of predicate dispatch in the literature.

Let me ask you a few questions:

- Do you think that pattern matching like ML is good?
- Do you think that OO dispatch like Java is good?

If the answer is yes to both questions then predicate dispatch may be for you. It unifies both concepts into one. Predicate dispatch isn't simpler than having no predicate dispatch at all; it is a simpler alternative to OO dispatch if you have already decided you want pattern matching in your language. You reuse the concept of pattern matching for method dispatch instead of introducing a new, separate concept of dispatch.

Show me what problems you face that you cannot avoid using this technology for

There are none. You can solve any problem with if statements. It just won't be as nice. An example is a parser. You can use pattern matching on streams or if statements to write a parser but then your parser isn't extensible. If you write the pattern match with predicate dispatch you can add extra cases. This way you can extend the language your parser parses.

Another example is code transformation in compilers. For example http://www.cs.indiana.edu/~dyb/pubs/nano-jfp.pdf. This would work very well with predicate dispatch.

Another example is web applications. In Hobo you can define template elements that use predicate dispatch. This gives you a rule-based template language. A common simple pattern is that you define tags that display something if the user has/doesn't have a certain permission to perform an action.

On being COMPLEX

I'm curious what you find complex about this though?

I deal with complex systems on a daily basis, but I intuitively know what you are proposing is more complex. Trying to dig up reasons why, I will reflect your example back to you:

def Expr?(x) = false
def Expr?(Num(n)) = true
def Expr?(Plus(a,b)) = true
def Expr?(Minus(a,b)) = true

I do not understand why

def Expr?(x) = false

comes first. It looks like the wildcard or default case, and should come last.

Moreover, to modify the problem domain I have to examine many potential cut points. This is not the case with OO when doing proper OO analysis; a class symbolizes a problem domain element, not a controller.

I do not understand why def

I do not understand why
def Expr?(x) = false
comes first.

You could change the system so that it comes last. I just adhered to the OOP ordering of defining superclass methods first, subclass methods last. I think that this is the more intuitive ordering if you don't think of this as pattern matching and instead think of it as more powerful OO dispatch.

Moreover, to modify the problem domain I have to examine many potential cut points. This is not the case with OO when doing proper OO analysis; a class symbolizes a problem domain element, not a controller.

There are two cases to examine:

1) You can do a good OO design of the problem. Now you can just transliterate the solution to predicate dispatch and it'll have the same nice properties as your OO solution.
2) The if-statement solution is not extensible and the OO solution is overly complex (for example because you have to introduce a lot of classes to get the behavior of the if statements). In this case there is no nice OO solution, and your predicate-dispatch solution may require careful analysis when reading the code. But the same is true of the solution in the OO language.

This is a common argument. If someone proposes to change a feature X into a more general feature Y, people will say that you can write unreadable programs with feature Y. They forget that if you wrote the same program with feature X it would be even more unreadable.

I deal with complex systems on a daily basis, but I intuitively know what you are proposing is more complex.

What exactly do you find complex? The semantics of this feature? The semantics are very simple (even simpler than OO dispatch) so that probably isn't it. Are you afraid that people will code up complex programs with this feature?

Orthogonal

How is that not orthogonal?

That is symmetric, not orthogonal.

No, that's orthogonal. You can use any sort of abstraction for any sort of member, just like in an orthogonal instruction set you can use any instruction with any addressing mode.

The goal should be creating better languages, and open discussion helps with that. I'm trying to raise questions and reevaluate design choices, not attack Scala.

I am all for evaluating design choices in a detached and systematic way. If you in fact tried to do that in your previous mails, fine.

Different areas of computer

Different areas of computer science use different meanings. This is the meaning I'm using in that sentence: http://c2.com/cgi/wiki?NonOrthogonalLanguageFeatures

I am all for evaluating design choices in a detached and systematic way. If you in fact tried to do that in your previous mails, fine.

You asked for criticism, which is good, but when you get criticism you become defensive. You ask what I mean by orthogonal, but when I explain what I mean by orthogonal you respond with "No, that's orthogonal." instead of addressing the problem. From the other points on my list you draw conclusions about what my (supposedly wrong) opinions about other languages are, instead of addressing the concerns. This could have been an interesting discussion about the design choices behind Scala, and an opportunity for you to explain and for me to learn about them, but unfortunately you turned the discussion away from the design choices and toward opinions.

Scheme programmer becoming fond of Haskell

On the other hand, in my twenty years of involvement with the FP community I have never known someone who converted from Scheme to Haskell or the other way.

I have been programming in Scheme for years, and I will probably use Haskell for my next hobby project. I originally wrote my "fun" projects in Scheme because it was really easy to learn (I probably became proficient in it in a day), and because I had prior knowledge of Emacs Lisp and other Lisp dialects (so I think the transition was obvious). But now I want features like static typing (I annotate the so-called types of all of my functions in Scheme in comments) and immutable-by-default data structures, and I have been growing envious of various features in Haskell, Erlang and ML for a while... For the most part, I was simply too lazy to use those languages more often until recently.

There are several reasons why I'm more interested in Haskell than one of the MLs, but to be concise I find that the community is larger, and I think that the language is unfairly accused of being unsuitable for "real" programs. (I always hear this accusation made of Scheme as well)

Transition-wise, I think that when Standard ML came out, a lot of Scheme programmers switched to that, and when Haskell came out, the SML/OCaml programmers may have switched again. So I don't see any of the languages being fundamentally different, although the cultures of "dynamic" and "static" languages seem to attract or repel people. (In my case, I'm not repelled by either)

I don't really see it as an all-or-nothing proposition. All of my stuff written in Scheme will probably stay written in Scheme, (eg. scripts) and my stuff written in Haskell will probably remain in that language.

That said, I'd probably choose Scala over Clojure, because Clojure would have additional hurdles to deal with in a software company:
-skepticism of the syntax
-not statically typed

I think that both are fine functional languages for the JVM.

Generally, I am not happy about how Scala and Clojure are often pitted against each other. There's no reason for this.

I think that the reason why both languages are so often compared is that both are used in the same niche, which is promoting FP in conjunction with OOP, with Java as a platform (e.g. promoting FP techniques to commercial software companies). Keep in mind that I still agree with your assessment that Clojure is primarily targeted towards Lisp or Scheme programmers, and Scala towards ML or Haskell programmers.

"the implementation is good"

I think taking the fast track to a solid implementation is the key. Clojure--and Factor--have already outshone quite a few other languages that showed promise but stagnated and never rose above the level of interesting and experimental. Nicely done!

choosing battles

not having to spend time on static type checking i guess means you can spend that elsewhere?

Long live diversity!

There is an undercurrent in some of this thread (and elsewhere) that seems to believe that languages for the JVM are destined to play out a PL version of Highlander: "There can be only one"!

Personally, I find it more exciting that a single platform is spawning such radically different language designs, and that they can share an underlying body of established libraries and add-ons so that the choice between them is not so dependent on the dreaded "network effect" and the accidents of history.

No PL is or can be the "one perfect language", so I prefer a world where there are many healthy options to choose based on circumstance, judgment and taste.

Rich and Martin, thanks for your contributions to this world!

One perfect language

Of course you're right that there isn't a best language for all problems, but it does not follow from this observation that we're better off using multiple languages for the problems we encounter.

If we consider a programmer using multiple languages as working in one union language where the file extension of his source determines which language constructs are in scope, then this meta-language is probably poorly factored with lots of ad hoc differences in syntax and semantics that only serve to block interoperability between sub-languages. We'd be better off factoring the important semantic differences into libraries and having one language with a common core.

One Extensible Language

We'd be better off factoring the important semantic differences into libraries and having one language with a common core.

The important syntactic differences should be factored off into libraries, too!

I'd like to see more languages providing some powerful syntactic extensibility from within the language (e.g. via PEG). I'd also like some greater support for context-sensitive grammars - perhaps via the attribute grammar mechanisms. The latter can be nice in small doses, for reducing explicit repetition of expression.

One language at a time

Of course you're right that there isn't a best language for all problems, but it does not follow from this observation that we're better off using multiple languages for the problems we encounter.

Just because I claim that a diversity of languages in the programming world or on the JVM is good, it doesn't follow that I'm advocating using a diversity of languages by the same person at the same time on the same project.