Enso: William Cook's new programming model

William Cook is documenting a new graph-based "programming model" on his blog. It seems quite early in development, probably too early to offer qualified comments, but the claimed properties are interesting:

Ensō is a theoretically sound and practical reformulation of the concepts of model-driven software development. Ensō is based on first-class structural descriptions, invertible transformations, generic operations and interpretation.

Structures in Ensō are a specialized kind of graph, whose nodes are either primitive data or collections of observable properties, whose values are either nodes or collections of nodes. (snip) The key point is that structures in Ensō are viewed holistically as graphs, not as individual values or traditional sums-and-products data structures.

A structural description, or schema, specifies some of the observable properties of structures. Schemas are used to check the consistency of structures. Some properties can be checked while the structure is being created, but others can only be checked once the structure is complete. Ensō allows modification of structures, which is necessary to create cyclic graphs, but also allows valid structures to be sealed to prevent further changes.
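To make the sealing idea concrete, here is a minimal sketch in Haskell (my own illustration, not Ensō's actual API): nodes remain editable until a seal operation runs a whole-structure check and then rejects further changes.

    import Data.IORef
    import Control.Monad (when)

    -- A node with observable properties; successors may be edited until sealed.
    data Node = Node
      { label  :: String
      , succs  :: IORef [Node]
      , sealed :: IORef Bool
      }

    newNode :: String -> IO Node
    newNode l = Node l <$> newIORef [] <*> newIORef False

    -- Mutation is what lets us close a cycle after both nodes exist.
    link :: Node -> Node -> IO ()
    link a b = do
      s <- readIORef (sealed a)
      when s (error ("node " ++ label a ++ " is sealed"))
      modifyIORef (succs a) (b :)

    -- A check that only makes sense once the structure is complete: here a
    -- hypothetical schema demands that every node have a successor.
    seal :: Node -> IO ()
    seal n = do
      ss <- readIORef (succs n)
      when (null ss) (error ("schema violation at " ++ label n))
      writeIORef (sealed n) True

    main :: IO ()
    main = do
      a <- newNode "a"
      b <- newNode "b"
      link a b
      link b a           -- the cycle a -> b -> a now exists
      mapM_ seal [a, b]  -- any later link would be rejected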

Invertible transformations are used to map one structure into another kind of structure, such that the mapping can be inverted to (partially) recover the original structure from the result. One common kind of transformation is called a grammar, which is an invertible transformation between structures and text. Text grammars are invertible because they can be used for parsing and also rendering.
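As a rough illustration of that idea (again a Haskell sketch, not Ensō's notation), a single grammar value can be interpreted in both directions, so parsing and rendering are inverses by construction:

    {-# LANGUAGE GADTs #-}

    -- One grammar value, two interpretations: render to text and parse from
    -- text, so the structure <-> text mapping is invertible by construction.
    data Grammar a where
      Lit   :: String -> Grammar ()                      -- fixed text, no data
      Num   :: Grammar Int                               -- a decimal number
      (:*:) :: Grammar a -> Grammar b -> Grammar (a, b)  -- sequencing

    render :: Grammar a -> a -> String
    render (Lit s)   _      = s
    render Num       n      = show n
    render (f :*: g) (x, y) = render f x ++ render g y

    parse :: Grammar a -> String -> Maybe (a, String)
    parse (Lit s) inp
      | s == take (length s) inp = Just ((), drop (length s) inp)
      | otherwise                = Nothing
    parse Num inp = case span (`elem` "0123456789") inp of
      ("", _)    -> Nothing
      (ds, rest) -> Just (read ds, rest)
    parse (f :*: g) inp = do
      (x, rest)  <- parse f inp
      (y, rest') <- parse g rest
      Just ((x, y), rest')

    -- A point written as "3,4": the same description drives both directions.
    point :: Grammar (Int, ((), Int))
    point = Num :*: (Lit "," :*: Num)
    -- render point (3, ((), 4)) == "3,4"
    -- parse point "3,4"         == Just ((3, ((), 4)), "")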

Cyclic Graphs

Ensō allows modification of structures, which is necessary to create cyclic graphs

There are plenty of techniques that allow expressing cyclic relationships - letrec, logic variables, graph expression as a matrix or association list, et cetera. I'm curious as to what else motivated support for mutable structures.
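For instance, in Haskell the cycle a -> b -> a can be expressed without any mutation, either by tying the knot (a letrec) or as plain association-list data:

    -- Two ways to express the cycle a -> b -> a without mutation.

    data Node = Node { label :: String, next :: Node }

    -- 1. Tying the knot: mutually recursive bindings (a letrec); laziness
    --    closes the cycle with no assignment in sight.
    knot :: Node
    knot = a
      where a = Node "a" b
            b = Node "b" a

    -- 2. A graph as plain data: an association list of named edges.
    assoc :: [(String, [String])]
    assoc = [("a", ["b"]), ("b", ["a"])]

    main :: IO ()
    main = do
      putStrLn (label (next (next knot)))  -- "a": we walked around the cycle
      print (lookup "b" assoc)             -- Just ["a"]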

Theory of descriptions

I think this means mutability is needed in the context of Enso's description-based structures.

Project blog

The link above is to Cook's own blog, but there is a new project blog for Enso itself.

Neat

Sven Apel's comments on that blog about feature-oriented software development confirmed my thoughts about using this for software factory / software product line applications.

Update

I posted another item to the Ensō blog. I will continue to post once or twice a week. I'm not sure people want to hear about ideas in progress, but we'll see how it goes. If you want to try out the code, let me know.

As for cycles and mutation, so far Ensō is partly functional and partly imperative. But this is not a distinction that is driving our design. We find it convenient to build cyclic structures directly using mutation. The other techniques that were mentioned are more indirect. Have we gotten to a point in language design where pure functional is the default and we have to justify including imperative features? I like the idea of "locally functional and globally imperative". Hmm, an acronym? LFGI. This works well for database-driven applications, and is also my view of Erlang.

Imperative cycles

Imperative can work well enough in-the-small, if well isolated (e.g. using Haskell's ST monad). You could build cyclic structures that way. That said, I would rather favor a more scalable and incremental model.
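For example, a minimal ST sketch: the two-node cycle is built imperatively, but runST guarantees the mutation cannot leak out.

    import Control.Monad.ST
    import Data.STRef

    -- Nodes whose successor is a mutable cell local to one ST computation.
    data Node s = Node { label :: String, next :: STRef s (Maybe (Node s)) }

    -- Build a two-node cycle imperatively, walk it, return a pure result.
    -- runST seals the mutation off from the rest of the program.
    walkCycle :: Int -> [String]
    walkCycle n = runST (do
      ra <- newSTRef Nothing
      rb <- newSTRef Nothing
      let a = Node "a" ra
          b = Node "b" rb
      writeSTRef ra (Just b)  -- a -> b
      writeSTRef rb (Just a)  -- b -> a: the cycle now exists
      let step _    0 = pure []
          step node k = do
            m <- readSTRef (next node)
            case m of
              Nothing  -> pure []
              Just nxt -> (label nxt :) <$> step nxt (k - 1)
      step a n)

    -- walkCycle 4 == ["b","a","b","a"]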

Imperative fails us at scale, when we introduce concurrency. Even 'globally imperative' is problematic, since it remains difficult to achieve global consistency and coordination.

Have we gotten to a point in language design where pure functional is the default and we have to justify including imperative features?

Huh? Shouldn't you have a cogent justification for every language design decision? Why make a decision at all, without justifying it? What's this about 'defaults'?

Locally Functional, Globally Imperative: FRelP?

...sounds just like Functional Relational Programming -- the "other" FRP! (I've been calling it FRelP to disambiguate)

Imperative

I don't see why you think that imperative works poorly at scale. This is what databases do. They are large-scale imperative. The transactions that make changes can be purely functional. This is how Erlang works, and I think it makes sense for database computations as well.

If you look at my recent blog post on Ensō data you'll see that we are creating cycles declaratively in our high-level domain-specific languages. But under the covers the interpreters are currently imperative.

Yes, I suppose we do have to justify all features eventually. But during a design exploration I think it is better to follow intuition, and let the solutions evolve as you struggle to solve chosen examples, rather than work from abstract principles (i.e. dogma).
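A minimal sketch of that division of labor, using Haskell's STM as a stand-in transaction manager (an illustration, not Ensō's implementation): every change is a pure Db -> Db function, and the only imperative step is committing one atomically.

    import Control.Concurrent.STM

    -- "Locally functional, globally imperative": shared state lives in one
    -- cell, and every change is a pure function committed in a transaction.
    type Db = [(String, Int)]

    -- A pure transition: no IO, no mutation, just Db -> Db.
    deposit :: String -> Int -> Db -> Db
    deposit name n = map upd
      where upd (k, v) | k == name = (k, v + n)
                       | otherwise = (k, v)

    -- The only imperative step: commit a pure transition atomically.
    commit :: TVar Db -> (Db -> Db) -> IO ()
    commit db f = atomically (modifyTVar' db f)

    main :: IO ()
    main = do
      db <- newTVarIO [("alice", 100), ("bob", 50)]
      commit db (deposit "alice" 25)
      commit db (deposit "bob" 10 . deposit "alice" (-5))  -- transitions compose
      readTVarIO db >>= print  -- [("alice",120),("bob",60)]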

Re: Transactions at scale

Transactions are a mediocre answer. Some points:

  1. Large transactions have high risk of disruption and rework. By nature, transactions must be limited to small-scale tasks.
  2. Transactions aid us by increasing the granularity of our commands. If we keep them small, we still face all the same concurrency issues once our commands cross multiple transactions.
  3. A compositional system requires distributed transactions, i.e. such that a client can invoke two services in a single transaction. Without this, we're forced to provide a monolithic service and database to serve all client requirements.
  4. If we use distributed transactions, it becomes difficult to control their scope. We can introduce transaction barriers (i.e. that delay messages until commit) but we can only control those in a small subset of the application.
  5. Transactions are not very robust in a security sense. I.e. they are very vulnerable to denial of service attacks. We can mitigate this with priority, but administrators for different services in a system are unlikely to agree with your priority requests.
  6. Transactions don't integrate well at the edges - sensors, actuators, HCI. Sensors often provide continuous data streams, so no clear transactional boundary exists. Actuators continuously observe, but cannot undo. Humans cannot be expected to perform much rework on account of a transaction failure and a change in the data; they really need a continuous command and control model that allows them to react to changes while they occur.

I did try transactions as a basis for one of my former language models, and the above points are a few of the reasons I abandoned it. That said, transactions aren't all bad; transactions are still leagues better than hold-and-wait or locking protocols. A few of my thoughts on controlling transaction scope are at the bottom of that page.

This is what databases do. They are large-scale imperative.

Databases do not succeed because of their imperative nature. Indeed, we could do without the imperative updates and queries - e.g. via a monotonic temporal database, or a reactive one.
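For instance, a sketch of the monotonic idea (with an assumed fact representation): updates only append timestamped facts, and "the value at time t" is a query, not the residue of overwrites.

    import Data.List (maximumBy)
    import Data.Ord (comparing)

    -- A monotonic temporal store: the history only grows, so nothing is
    -- ever destroyed by an update.
    type Time = Int
    type Fact k v = (Time, k, v)

    -- The latest fact about k at or before time t.
    valueAt :: Eq k => [Fact k v] -> k -> Time -> Maybe v
    valueAt facts k t =
      case [ (ts, v) | (ts, k', v) <- facts, k' == k, ts <= t ] of
        [] -> Nothing
        xs -> Just (snd (maximumBy (comparing fst) xs))

    -- valueAt [(1,"x",10),(3,"x",30)] "x" 2 == Just 10   (history preserved)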

we do have to justify all features eventually. But during a design exploration I think it is better to follow intuition

Well, sure. I'm all for exploratory language design, so long as you don't confuse 'exploration' with 'decision'.

I'm not fond of intuition, though, since it is too easy to confuse with familiarity and varies too widely among developers to serve as a basis for reasoning. I much prefer creating user-stories and scenarios (or perhaps 'developer stories' for users of a programming language) and pseudo-code to help me elucidate my reasoning and sort of 'regression-test' any new language designs.

Quote

I liked the following observation from Erik Meijer, though I am not sure it originates with him:

Transactions are compositional, locks are not.

Kind of sums up the whole problem with scalability of transactions.
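For what it's worth, Haskell's STM gives a concrete rendering of the slogan (an illustrative sketch): two transactions written independently compose into one atomic action, whereas the lock-based equivalent of transfer would need to know both accounts' locks and a global acquisition order to avoid deadlock.

    import Control.Concurrent.STM

    withdraw, deposit :: TVar Int -> Int -> STM ()
    withdraw acct n = modifyTVar' acct (subtract n)
    deposit  acct n = modifyTVar' acct (+ n)

    -- Two independently written transactions compose into a larger atomic
    -- action, with no new locking protocol required of the caller.
    transfer :: TVar Int -> TVar Int -> Int -> IO ()
    transfer from to n = atomically (withdraw from n >> deposit to n)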

Scalable models are compositional

Scalable models are compositional. Being compositional is effectively a requirement for scaling to multi-developer systems. I describe this more thoroughly in another thread. So it isn't very clear to me how being compositional "sums up the whole problem with scalability of transactions". Being compositional isn't the problem. Certainly, transactions are much more scalable than locks.

The fundamental issue with transaction scalability is that the model lacks a physical foundation. The world isn't atomic, and so it fights back. This isn't so much an 'abstraction leak' as 'implementation details busting free'. Thus, a second requirement for scalable computation models: they must work with the world rather than against it.

Oh, that...

Well, to clarify: there's a 'tension' inherent in that statement, in the sense that you always perform transactions with (or against) some entity, and that entity (generalizing where one cannot) will usually lock some resources to perform them safely.

I.e., one wants compositionality, for the reasons you describe, and transactions certainly provide that; but one layer down, where things become concrete, you're still locking, and that makes it hard to scale.

[ Just a typical leaky abstraction. ]

Lock-based implementation not relevant

I think I see what you're saying: a lot of transaction systems are implemented using locks.

But I'm not convinced that's a relevant concern. Even if we implement optimistic (lock-free) transactions, the issues I present above apply. Nor would using locks to implement transactions resolve any of those issues - e.g. we still shouldn't start work with actuators or whatever until the transaction commits.
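To make "optimistic" concrete, a sketch reduced to one cell (plain compare-and-commit, not any particular transaction manager): read a snapshot, compute, commit only if the cell is unchanged, otherwise retry. No locks are held, yet the issues above still apply once work spans several such commits.

    import Data.IORef

    -- Optimistic, lock-free update: read a snapshot, compute, then commit
    -- only if the cell still holds the snapshot value; otherwise retry.
    -- Nothing is locked while computing, but conflicts force rework.
    casLoop :: Eq a => IORef a -> (a -> a) -> IO ()
    casLoop ref f = do
      old <- readIORef ref
      ok  <- atomicModifyIORef' ref
               (\cur -> if cur == old then (f cur, True) else (cur, False))
      if ok then pure () else casLoop ref f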

If one agrees with not

Thus, a second requirement for scalable computation models: they must work with the world rather than against it.

If one agrees that your statement is not merely stating the obvious, then I suspect it is useful to consider that, a priori:

a) it does hold (i.e., you're likely right) and

b) it likely also requires us to take into account the full extent of its implications w.r.t. managing the tension (or should that be "distance"?) between:

i) the current state of affairs in designing more or less globally distributed systems and

ii) topics related to the rationale for, and choice of, language design paradigms.

There, I also suspect that many points from Roy T. Fielding's thesis (*) regarding the former could be usefully leveraged, or reused, if only for a broader (or deeper?) perspective on the latter. (E.g., the kind of "high level" ideas that crossed my own mind, FWIW.)

(*) Architectural Styles and the Design of Network-based Software Architectures
http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm

[EDIT] Btw, I do find Ensō's ideas pretty interesting, a priori; I'm definitely willing to read and learn more about them in the future, as they develop and turn into more tangible materials.

A Language of Micro Databases

from the article: When you think of data structures as mini databases some interesting things can happen.

With this, I agree. We can generalize from 'collection-oriented' programming to work with models where the basic values are rich databases. Efforts along this line that have interested me are: extended set theory (from Dave Childs), modular set processing via Datalog-like combinators (which is exemplified well in Bloom), and even the possibility of using generative grammars to carry information.

Part of my interest in values as databases is the potential for incremental processing in an eventless reactive model... i.e. we can describe one database as a distributed query on others, then implement it by streaming 'diffs' to established clients. Mine is a very different vision for database connectivity than the common imperative/transactional approach seen today. I would like distributed live queries and bi-directional control via lenses or mutable views. Though, I still favor the relational model.

Imperative state manipulation of data structures also doesn't play nice with incremental computations in reactive systems. We really need persistent data structures, such that each change has the space-time computational overhead of the diff, yet the old structure of the previous step(s) may continue to be observed in parallel while we generate the new one.
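A small sketch of both points, using Haskell's persistent Data.Map as a stand-in: an update builds a new version in O(log n) with structural sharing, the old version remains observable, and the diff between versions is exactly what one would stream to clients.

    import qualified Data.Map as Map

    type Db = Map.Map Int String
    data Change = Put Int String | Del Int deriving Show

    -- The changes that take one version to the next: stream these to clients.
    diff :: Db -> Db -> [Change]
    diff old new =
      [ Put k v | (k, v) <- Map.toList new, Map.lookup k old /= Just v ] ++
      [ Del k   | k <- Map.keys old, Map.notMember k new ]

    apply :: [Change] -> Db -> Db
    apply cs db = foldl step db cs
      where step m (Put k v) = Map.insert k v m
            step m (Del k)   = Map.delete k m

    main :: IO ()
    main = do
      let v1 = Map.fromList [(1, "a"), (2, "b")]
          v2 = Map.insert 2 "B" v1              -- O(log n); v1 is untouched
      print (Map.lookup 2 v1, Map.lookup 2 v2)  -- (Just "b",Just "B")
      print (diff v1 v2)                        -- [Put 2 "B"]
      print (apply (diff v1 v2) v1 == v2)       -- True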

Finally, some papers on Enso

We have updated the Ensō web site and added two papers:

As for some of the discussion here, I think that the issue with mutable state is how to manage and control its scope. This is not easy. The main use case that is problematic (but not necessarily impossible) for functional programming is making a small change to a huge (terabyte) relational database. In theory this requires copying the entire database. One technique that we know works is for the individual changes to be pure functional, while the overall state is managed with transactions. Erlang also has global state and pure functional transitions, but without the transactions. We are still exploring what to do in Ensō.

Excellent. Front page.

This is home page material.

Enso presentation

I gave a presentation on Enso at MSR last week.

Managed data is definitely a

Managed data is definitely a useful concept, even for existing languages. I just developed a small preview of managed data for .NET in case anyone's interested. I think it captures the essentials of what's described in the paper, although in a typed context (the paper uses Ruby).