John Shutt on "Abstractive Power"

John has written a long and promising blog post on abstractive power.

Another four decades of experience (since Standish's post-mortem on the extensible languages movement) suggests that all programming languages have their own envelopes of reachable extensions. However, some languages have much bigger, or smaller, envelopes than others. What factors determine the size, and shape, of the envelope? What, in particular, can a programming language designer do to maximize this envelope, and what are the implications of doing so?

As a useful metaphor, I call the breadth of a language's envelope its radius of abstraction. Why "abstraction"? Well, consider how the languages in this envelope are reached. Starting from the base language, you incrementally modify the language by using facilities provided within the language. That is, the new (extended) language is drawn out from the old (base) language, in which the new language had been latently present.

[..]

The Smoothness Conjecture [defined earlier in the post] is a neat expression of a design priority shared by a number of programming language designers; but, looking at programming language designs over the decades, clearly many designers either don't share the priority, or don't agree on how to pursue it.

What, though, if we could develop a mathematical framework for studying the abstractive power of programming languages — a theory of abstraction. One might then have an objective basis for discussing design principles such as the Smoothness Conjecture, that to date have always been largely a matter of taste.

I'll admit I haven't read it yet, but I'm very happy to see it and hope some of you may be as well -- I even dare to suspect that recent mentions of this topic here on LtU may have been one of the motivations for John's large effort in writing and releasing this document.

Now for a hefty amount of reading, and then some programming language discussion!

A programming language needs a model of computation

A programming language needs to be mapped to a model of computation. Otherwise, it is syntax without semantics.

Some models of computation (e.g., functions that can be implemented in the lambda calculus) limit the abstractions that can be implemented.

No comprende

I still don't get the ideas here. I haven't ever worked through the details of Felleisen's result, but the idea of comparing expressiveness of two languages by whether or not it's possible to macro express one in the other is pretty simple. I have been unable to attain a comparable understanding of your abstractiveness from the blog and technical report.

For example, you say that abstractiveness is like a second derivative of semantics, but I have trouble even identifying Felleisen's expressiveness as a first derivative of semantics. A high first derivative of semantics would seem to me to mean that you change the program a little and the meaning changes a lot. That doesn't sound like a very useful property. Expressiveness is an ordering of languages rather than a rate of anything. By the time we go to take a second derivative I'm completely lost.

And I don't like the example you've chosen as a test case, because it feels like something that fits perfectly well into Felleisen's expressiveness, save for a technicality that reporting error messages in certain cases isn't one of his observables. Can you give a test case that better conveys what abstractiveness is about?

In general, my intuition is that there are certain classes of languages that are maximally extensible, in a sense that can probably be made precise, but that involves only a reachability test and not a metric/distance. And that this will be a useful property for a language to have, but that there will be a large variance in language quality among languages with this extensibility property.

No metric/distance

A misunderstanding, perchance. There's no metric/distance in the treatment; the language of metrics is used in the purely metaphorical term "radius of abstraction", but that's it. The treatment is mathematical, but not numerical. Notably, the derivative metaphor isn't intended in any remotely numerical sense. The derivative metaphor also isn't meant to suggest a positive correlation between "sensitivity of the semantics to small program changes" and "expressive power". I have occasionally used the term "capricious" for a language in which small program changes are likely to produce abrupt discontinuities in the semantic landscape; at a guess, that seems likely to diminish abstractive power, being rather at odds with the intuition behind the term "smoothness".

The point of the example is that it doesn't fit expressiveness, because the languages are of equal expressive power (or, if one sets up the languages differently, of weakly equal expressive power); to distinguish them decisively using expressiveness one would have to restrict the expressive transformations so as to exclude the remove-"private"-keyword operation, which would probably leave nothing but subsetting of languages (category Inc). To distinguish those two languages decisively, so that the distinction doesn't rely on fine points of the set-up nor on draconian restrictions on the class of transformations from one language to the other, I looked at whether the transformation between languages respects subsetting relationships between derived languages, which is (a specific instance of) my definition of abstractive power. It turns out that any macro transformation from the one language to the other decisively fails to preserve [subsetting] relationships between corresponding derived languages.
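For concreteness, here is a loose C++-flavoured analogue (my illustration only, not the formal set-up in the report): the remove-"private" operation is essentially the old #define private public trick applied uniformly to every term, which is a perfectly good macro transformation and is why the two languages come out equally expressive.

    // Loose C++ analogue, not the formal set-up in the report.
    // The remove-"private" macro transformation: #define private public

    // Prefix text in the language WITH the keyword:
    class Counter {
    private:
        int n = 0;              // this prefix forbids suffixes that touch n directly
    public:
        void inc() { ++n; }
        int  get() const { return n; }
    };

    // The suffix below is legal only after the TRANSFORMED (private-erased)
    // prefix; after the original prefix it is rejected. So the transformation
    // does not preserve the subsetting relationships between the derived
    // languages the two prefixes reach, which is the distinction drawn above.
    // int peek(Counter& c) { return c.n; }

    int main() { Counter c; c.inc(); return c.get(); }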

OK, so do I understand

OK, so do I understand correctly that your definition of abstractiveness attempts to model how definitions in the prefix of a program affect the language accepted by the suffix of a program? If I add a primitive to my language that allows an arbitrary interpreter to process the remaining token stream, why is that not maximally abstractive in your framework?

in two parts

I'll tackle that in two parts.

do I understand correctly that your definition of abstractiveness attempts to model how definitions in the prefix of a program affect the language accepted by the suffix of a program?

Hm. In my treatment there isn't really such a thing as a "complete program", just a sequence of texts, some of which may be observable, and the sequence may or may not be a prefix of some other allowed sequences. The set of suffixes that can be added to this prefix, forming these other allowed sequences, is itself a set of sequences of texts, which makes it a programming language, and I'm interested in how the prefix that reached this new language affects the properties of the new language.
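A small concrete illustration, using C++ purely as familiar notation (the treatment itself is language-agnostic): think of each top-level declaration as one text in the sequence.

    // C++ used purely as familiar notation; the treatment is language-agnostic.
    // Think of each top-level declaration as one "text" in the sequence.

    struct Point { int x, y; };      // the prefix written so far

    // The set of suffixes that can legally follow that prefix is itself a
    // language, and it now contains texts such as this one, which was not a
    // legal continuation of the empty prefix:
    Point origin() { return {0, 0}; }

    int main() { return origin().x; }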

If I add a primitive to my language that allows an arbitrary interpreter to process the remaining token stream, why is that not maximally abstractive in your framework?

In order for the language to be more abstractively powerful than another language, or even to be more expressively powerful, it must be mapped into by a morphism φ that satisfies given structural constraints on how it's allowed to rearrange programs. The usual structural constraint is category Macro, applying a macro/polynomial transformation uniformly to each term in the sequence. If instead one allowed arbitrary decidable morphisms, then it would presumably always be possible to find a morphism into the run-an-interpreter language, which shows that without structural constraints on φ, expressive power tends to devolve into computational power.
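To make the wrapping trick concrete (a sketch only; interpret here is a stand-in for the single primitive of the run-an-interpreter language, not anything defined in the report):

    #include <iostream>
    #include <string>

    // Sketch only: "interpret" stands in for the single primitive of the
    // run-an-interpreter language; it is not anything defined in the report.
    int interpret(const std::string& source) {
        std::cout << "pretend we evaluate: " << source << "\n";
        return 0;
    }

    // Without structural constraints, a morphism phi may map ANY program P of
    // the source language to the one-term target program interpret("P").
    // Quoting the whole of P is not a uniform term-by-term (Macro/polynomial)
    // rewrite, which is why admitting such morphisms collapses expressive
    // power into mere computational power.
    int main() {
        return interpret("(define (square x) (* x x)) (square 6)");
    }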

We might have a related approach

It turns out we've been tackling something similar. I work on an extensible-languages tool called Silver. Our goal has been to develop a practical notion of language extension in which extensions are guaranteed to compose together reliably. That, of course, requires restricting the range of extensions that are possible. We've only relatively recently figured out the analyses and properties that extensions must satisfy to permit this composition property, and we've recently turned to looking at just how restrictive it is.

These properties essentially provide a dividing line between "this new language feature is a potential extension" and "this feature constitutes a change to the host language itself." We've discovered that the range of possible extensions varies considerably with the host language. Indeed, there seem to be a few small tweaks that host languages require before they allow useful extensions at all, and other changes that have really massive effects on what's possible. (We have a draft paper currently under review with some of these sorts of experiences in building an extensible C compiler. Hopefully I'll have some links mid-January, if anyone's interested.)

At any rate, it seems like John Shutt is trying to put restrictions on, for example, what macros are capable of, and thus expose a richer underlying structure of relationships between languages. Perhaps we have an alternative set of restrictions that get to a similar idea.

interesting

It does seem like the two approaches might benefit each other.

Apply to C++

I wonder if your framework can apply to the extension of C to C++. I have quite a few examples! Here's one.

A language C contains base types I and a postfix type constructor i*, so that if t is in I then t is in C, and if t is in C then t* is in C also. Let us now add the usual struct aggregate to get a new language C.

Now some wag comes along and adds template<T> to structs. C has the usual expressions, but now we add terms like X<int> where X is a template struct and int is in I.

So far so good, this works. But now someone else comes along and tries to add a new type i&. The rule for the original template-free C is simple enough: R is C extended by the set of types c& where c is in C, i.e. references are only allowed at the top level. Now we add templates to that and all hell breaks loose, because we can derive int&& which is nonsense.

This is actual C++. You cannot write this term but you can template expand to get it and it's defined as meaning int&.
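To spell that out in code (assuming a C++11 compiler, where the collapse is directly observable):

    #include <type_traits>

    // Assuming a C++11 compiler. Substituting T = int& into the term T& would
    // textually produce the unwritable "int& &"; the language defines it to
    // collapse back to int&.
    template<typename T>
    using add_lref = T&;

    static_assert(std::is_same<add_lref<int&>, int&>::value,
                  "template expansion of int& & collapses to int&");

    int main() { return 0; }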

So it's clear to me *intuitively* that the introduction of reference types limits the extensibility of the language by excluding coherent templates or vice versa.

Now please construct an argument that this is so in terms of your paper. I'm not looking for mathematics, just the way of *explaining* the problem. If we make references only part of function argument passing mechanics instead, what happens?

I have a list of faults in C++ a mile long of the kind "fails to be orthogonal", from unions of constructible types to the failure of templates to properly generalise C structs by disallowing incomplete types as arguments, thereby making combinatorial type recursion impossible.

In some cases the extensions were bad, but in most it is C that is so badly designed it is already at a fixpoint: radius-of-abstraction(C) == C. In fact it's worse: K&R C was almost there, but the introduction of typedefs in ISO C actually pushes it *outside* its radius of abstraction (you can't parse anything but whole programs, so the notion of transforming a prefix is nonsense).
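To make the typedef point concrete, the classic cast-versus-multiplication ambiguity shows why a suffix cannot even be parsed without its prefix:

    // The same token sequence in the function body parses differently
    // depending on a declaration made earlier in the translation unit,
    // so a suffix cannot even be parsed without its prefix.

    typedef int a;          // remove this line and the parse changes

    int f() {
        int b = 1;
        return (a)*&b;      // with the typedef: a cast of *&b to type a
                            // without it: the multiplication a * (&b), ill-formed
    }

    int main() { return f(); }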