## Clojure's Approach to Identity and State

Clojure has been discussed here before. It fits in the Lisp family with S-expressions, macros, and functions as values. Like most Lisps, it's dynamically typed and impure. But its focus is concurrency, so symbols are bound immutably by default; its standard library's collections are immutable and persistent; and its various mutable constructs are thread-safe, except, of course, for the back doors implied by I/O and JVM library interoperability. See vars, refs, and agents.
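For readers who haven't tried the language, here is a minimal sketch of what "immutable and persistent" means in practice. The names `v1`, `v2`, `m1`, and `m2` are made up for illustration.

```clojure
;; Illustrative names only; nothing here comes from the Clojure docs verbatim.
(def v1 [1 2 3])          ; an immutable vector
(def v2 (conj v1 4))      ; "adding" returns a new vector that shares structure with v1

(println v1)              ; prints: [1 2 3]   (v1 is untouched)
(println v2)              ; prints: [1 2 3 4]

(def m1 {:a 1})
(def m2 (assoc m1 :b 2))  ; same idea for maps: m1 still has only :a

(println (contains? m1 :b)) ; prints: false
```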

What I wanted to highlight is a position paper of sorts that Rich Hickey has posted on Clojure's Approach to Identity and State. An excerpt:

While some programs are merely large functions, e.g. compilers or theorem provers, many others are not - they are more like working models, and as such need to support what I'll refer to in this discussion as identity. By identity I mean a stable logical entity associated with a series of different values over time. Models need identity for the same reasons humans need identity - to represent the world.
...
In Clojure's model, value calculation is purely functional. Values never change. New values are functions of old, not mutations. But logical identity is well supported, via atomic references to values (Refs and Agents). Changes to references are controlled/coordinated by the system - i.e. cooperation is not optional and not manual.

There are other ways to model identity and state, one of the more popular of which is the message-passing actor model, best exemplified by the quite impressive Erlang. ... It is important to understand that the actor model was designed to address the problems of distributed programs. And the problems of distributed programs are much harder ... Clojure may eventually support the actor model for distributed programming, paying the price only when distribution is required, but I think it is quite cumbersome for same-process programming. YMMV of course.
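To make the excerpt's reference model concrete, here is a rough sketch of refs and agents. The names `account-a`, `account-b`, `transfer!`, and `audit-log` are hypothetical and are not taken from Hickey's essay.

```clojure
;; Hypothetical example; all names below are invented for this sketch.

;; Two identities. At any moment each one refers to an immutable value.
(def account-a (ref 100))
(def account-b (ref 0))

(defn transfer!
  "Moves `amount` from one ref to another inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)   ; the new value is a function of the old one
    (alter to + amount)))

(transfer! account-a account-b 30)

;; Reading is a plain dereference.
(println @account-a @account-b) ; prints: 70 30

;; An agent is an identity whose changes are applied asynchronously.
(def audit-log (agent []))
(send audit-log conj "transferred 30")
(await audit-log)               ; block until the queued action has run

;; Lets a standalone script exit promptly; leave this out at a REPL.
(shutdown-agents)
```

Calling `alter` outside `dosync` throws an exception, which is one way to read "cooperation is not optional and not manual".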

The essay is worth a read on a couple of levels of interest to LtU. At an abstract level, it's a good example of a well-articulated design justification. Agree or not, it's clear that Hickey gave thought to his decisions. Too many language designers fall into the trap of blindly inheriting semantics from a favorite language and end up putting new lipstick on the same pig. Any language designer would do well to write an essay or two like this before jumping into their venture.

At the specific level, the core approach is certainly worthy of discussion and alternative designs. Is mutable state really the best way to deal with "working models"? Are there things that the pure functional camp can do to make "working models" more convenient? For example, do Clean's uniqueness types fit the bill, at least for sequential programming, or are they too restrictive? Are there things that can make Erlang style actors less cumbersome to use, especially when distribution isn't necessary?


### Arity pattern matching for &opt and &rest args - Interesting

Perusing the Clojure docs, I noticed that Clojure seems to use a pattern-matching-like mechanism on the arity of a function to implement traditional Lisp-style optional and rest arguments.
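A small sketch of what this looks like, as far as I can tell from the docs. The function name `minimum` is a hypothetical one chosen for this example.

```clojure
;; `minimum` is a made-up name for this sketch.
(defn minimum
  ([x] x)                                  ; exactly one argument
  ([x y] (if (< x y) x y))                 ; exactly two arguments
  ([x y & more]                            ; three or more; `more` holds the rest
   (reduce minimum (minimum x y) more)))

(println (minimum 5))        ; prints: 5
(println (minimum 5 2))      ; prints: 2
(println (minimum 5 2 9 1))  ; prints: 1
```

Each parenthesized clause is selected by the number of arguments at the call site, and only one clause may take rest arguments.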

While a bit more verbose than CL style lambda lists, I must say that I'm intrigued with this approach and might consider it for my own "toy" language.

Preliminarily, I imagine that Clojure's "arity pattern matching" function definitions offer greater opportunities for convenient compile-time analysis and inlining than CL-style opt/rest lambda lists, while also being more expressive in the general case.

Are there any other languages that pattern match on function arity or any papers that elaborate, weigh the pros/cons, etc. of this function definition style?

Scott

### My language Kogut does that

Example:

```
def Min [
()        {Inf}
x         {x}
x! y!     {}   // dispatched on argument types
x y zs... {Min (Min x y) zs...}
];
```

The cases don't have to only check the arity; they can use arbitrary pattern matching on the sequence of parameters. Brackets are optional if there is only one case. This works for anonymous functions too.

### Erlang is cumbersome?

Are there things that can make Erlang style actors less cumbersome to use ...

Erlang is cumbersome? I don't know much about the language, but most of what I hear about it is quite positive. Are there any references I could read about difficulties with its style of actors?

### In the article

Hickey's specific complaints are in the article:

I chose not to use the Erlang-style actor model for same-process state management in Clojure for several reasons:

* It is a much more complex programming model, requiring 2-message conversations for the simplest data reads, and forcing the use of blocking message receives, which introduce the potential for deadlock. Programming for the failure modes of distribution means utilizing timeouts etc. It causes a bifurcation of the program protocols, some of which are represented by functions and others by the values of messages.

* It doesn't let you fully leverage the efficiencies of being in the same process. It is quite possible to efficiently directly share a large immutable data structure between threads, but the actor model forces intervening conversations and, potentially, copying. Reads and writes get serialized and block each other, etc.

* It reduces your flexibility in modeling - this is a world in which everyone sits in a windowless room and communicates only by mail. Programs are decomposed as piles of blocking switch statements. You can only handle messages you anticipated receiving. Coordinating activities involving multiple actors is very difficult. You can't observe anything without its cooperation/coordination - making ad-hoc reporting or analysis impossible, instead forcing every actor to participate in each protocol.

* It is often the case that taking something that works well locally and transparently distributing it doesn't work out - the conversation granularity is too chatty or the message payloads are too large or the failure modes change the optimal work partitioning, i.e. transparent distribution isn't transparent and the code has to change anyway.
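To illustrate the points above about observing state without cooperation and sharing immutable data between threads, here is a minimal sketch in Clojure. The names `inventory`, `snapshot`, and `writer` are hypothetical and do not come from the article.

```clojure
;; Hypothetical names throughout; this is a sketch, not code from the article.
(def inventory (ref {:apples 3 :pears 5}))

;; An observer takes a snapshot with a plain deref: no request/reply
;; round trip, and no cooperation from whoever updates `inventory`.
(def snapshot @inventory)

;; A writer on another thread installs a new value in a transaction.
(def writer (future (dosync (alter inventory update :apples inc))))
@writer                 ; wait for the writer thread to finish

;; The snapshot is an immutable value, so the write cannot affect it.
(println snapshot)      ; prints: {:apples 3, :pears 5}
(println @inventory)    ; prints: {:apples 4, :pears 5}

;; Lets a standalone script exit promptly; leave this out at a REPL.
(shutdown-agents)
```

In an actor design, the equivalent read would typically be a message to the owning process followed by a blocking receive for the reply.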