The Last Language?

In the last few years we've seen a return to languages that were invented in the '50s, '60s, and '70s. Clojure, Scala, F#, and even Ruby are derivatives of much older languages, and do not represent new ideas. This raises an important question: Have we explored the language space? This question is not nearly so absurd as it sounds. We may, indeed, have completely explored all the different types of computer languages. It may well be that any new language invented will simply be a minor improvement on an older concept. In this talk, Uncle Bob Martin asks if perhaps it's time we stopped exploring that space, and simply picked "The Last Programming Language". What would that language be like? What attributes should it have? And is this idea wise?

Have we completely explored the language space?

No, we haven't.

An attempt at a counterexample: I think there is much left to do in typed metaprogramming -- in the sense of "program generation". Typed staged languages like MetaML and MetaOCaml have explored the construction of typed program expressions to be run later, but this covers only a relatively small part of a programming language grammar: we could want to generate patterns (in the ML/Haskell sense), type declarations, modules, etc. in a typed fashion. I know of some attempts at rich typed metaprogramming of languages with binders (e.g. Beluga), but internalizing type declarations seems a step further. Much metaprogramming today is still done at the syntactic level, with no help from the host language to give guarantees about the target program beyond syntactic correctness.
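
For readers unfamiliar with the staged approach, here is a minimal sketch of what "typed program expressions to be run later" look like, written with GHC's Typed Template Haskell rather than MetaOCaml (I assume the GHC 8.x Q (TExp a) API here; newer GHCs spell it Code Q a). The staged power function is folklore; note that only expressions can be generated this way, which is exactly the limitation described above:

    {-# LANGUAGE TemplateHaskell #-}
    import Language.Haskell.TH (Q, TExp)

    -- [|| e ||] builds a typed code value; $$e splices one into a quotation.
    -- The types guarantee the generated expression is well typed, but
    -- patterns, type declarations, and modules are outside this fragment.
    power :: Int -> Q (TExp Int) -> Q (TExp Int)
    power 0 _ = [|| 1 ||]
    power n x = [|| $$x * $$(power (n - 1) x) ||]

    -- In another module (splices cannot run in their defining module):
    --   cube :: Int
    --   cube = $$(power 3 [|| 5 ||])   -- generates 5 * (5 * (5 * 1)) = 125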

Of course, LtU abounds with other examples of "exploring unknown programming language space", such as Sean McDirmid's touch-based languages and dmbarbour's Reactive Demand Programming.

PS: I'm sorry to react to the abstract of the talk only, rather than the talk content, but I can't get the video to play and wouldn't want to try to extract the interesting bits out of a one-hour video anyway. I would welcome a transcript, or at least the slides in a downloadable format.

In the end there can only be

In the end there can only be one.

Sorry, I couldn't resist.

No

Have we completely explored the PL design space?
Two words: "No, moron."

There are three issues here

1) Have we fully explored the design space for languages? 2) Is a "last language" even possible? 3) What would the last language look like?

I don't think we have fully explored the design space for languages, but we might have explored the design space for language features. I don't think the idea of a "last language" is quite the heresy it once was. However, if such an animal does exist, both C programmers and Smalltalk programmers would have to agree that it is significantly better than their preferred language, and I don't see that happening any time soon. The idea resurfaces every decade or so, and Bob Martin has some interesting ideas about what such a language might look like, but most of the presentation is more entertaining than informative. Here is my understanding of Bob Martin's thoughts on what the "last language" might look like. The last language:

  • will not be under the control of a corporation
  • will have garbage collection
  • will not have gotos
  • will constrain how we use assignment (more functional)
  • will provide polymorphism
  • will not be multi-paradigm
  • will execute on a virtual machine
  • will allow access to existing frameworks
  • will be fast
  • will be textual with simple syntax
  • will be homoiconic

    Bob Martin also claimed that static versus dynamic typing doesn't matter (much) to real programmers writing real software, so it wouldn't matter one way or the other in the last language.

  • Feature Space

    we might have explored the design space for language features

    Not even close! Unless you speak of features one at a time and in very high-level terms. But features interact, thus the feature space for N features is on the order of 2^N. Consider, for example:

    • 'anticipates the future' is a potentially useful feature in the domains of planning, coordination, conflict avoidance, smooth animation, gesture recognition, visual target tracking, command and control. However, by itself, it might mean you're limited to querying some shared planner with an omniscient view, which is inconvenient and increases coupling.
    • 'anticipates the future in bounded space' might be a bit less flexible than anticipating multiple futures, and certainly limits how far we can look ahead (to a constant value), but does allow us to reconcile this feature for use in embedded systems, real-time programming, potentially for down-to-the-metal FPGA or DSP compilation.
    • 'compositionally anticipates the future in bounded space' renders anticipation scalable and meaningfully useful for coordination. It allows us to combine independent subprograms that target different parts of the problem domain, and to coordinate or avoid conflict by anticipating the behavior or needs of the other components. I've read that Lustre has this feature.
    • 'compositionally anticipates the future in bounded space in an open system' allows us to use this feature between independently maintained, federated services, i.e. without a global compile. This renders anticipation valuable for, say, command and control and coordination of large numbers of robots in real-time. My RDP paradigm will have this feature.

    In the case I described, the features interact synergistically. Most arbitrary combinations of features, however, are not effective. (Some - such as lazy evaluation with imperative side-effects - are even disastrous.) Some people have expressed interest in developing a contradiction matrix for software engineering.

    We are not anywhere near a 'last language'. I agree with your opinion that we'll be wanting a language that works well both near-the-metal and near-the-developer. But we're able to tweak the metal, too (e.g. more focus on FPGAs and other flexible computing substrates), and I expect we'll want to do so for high-efficiency computing (to improve battery life, support sensor-clouds and robotics, allow more kinetic and light-powered computation and pervasive computing).

    Obviously I meant one at a time

    Obviously I meant one by one because if you consider them together I don't think there is a valid distinction between the feature design space and the language design space. His list of features gives an idea of the level he was thinking about so that sort of shaped how I answered.

    Bob Martin was making a pragmatic argument based on the fact that we really haven't seen anything truly unprecedented since the '60s or '70s. Obviously languages have been improving, but they haven't been changing in fundamental ways, or so his argument goes. Long on examples and generalizations, but no real substantive theoretical meat to the argument. Which is fine; that wasn't his purpose.

    Bob Martin said the only language that had all the features he expected was Clojure and that there was "no way in Hell" that was the last language but that maybe it held the seed of the last language. Not sure I agree with that but it did pique my interest about Clojure.

    Another way of thinking about the issue is to do the thought experiment where we are 100 years in the future and they use the "last language". I suspect that Bob Martin would claim that any programmer from today wouldn't have any conceptual difficulty picking up the changes and in very short order become a proficient programmer.

    Could be

    I suspect that Bob Martin would claim that any programmer from today wouldn't have any conceptual difficulty picking up the changes and in very short order become a proficient programmer.

    That's possible (assuming they already know Chinese). But hopefully they also think, "wow, this is so much better than the crap I used to use."

    This is a rather short term

    This is a rather short-term view of the last language. If humanity survives for a significant amount of time, then natural language processing will reach levels comparable to what humans can do. So the last language will probably be a derivative of English.

    Really? Only Lawyerese could

    Really? Only Lawyerese could possibly be unambiguous enough.

    I'd quit CS over that.

    Execute on a virtual machine

    I'm puzzled how execution on a virtual machine would count as a language feature - isn't that an implementation option, or am I missing something?

    portable language

    Bah, you're just arguing semantics - not something a real programmer ever thinks about. How many digits of pi do you know? ;-)

    Bob Martin's argument for virtual machines was along the lines of "because we don't want to be slaves to our hardware" and "we can spare the CPU". Virtual machines are a proven means to an end.

    I too have read only the abstract ...

    (Rant: documentation-by-video is a huge waste of everyone's time: I can read a transcript, conservatively, ten times as fast as watch a video. I can't devote an hour to every interesting-sounding new idea, but six minutes is no problem.)

    Three times so far the computing community has tried to invent universal general-purpose languages: PL/I, Algol 68, and C++. The first two each had their hour, but came to a bad end. C++ is still important, but no longer looks like the "last PL". A student of history would conclude that such languages will continue to appear infrequently but regularly, and as regularly continue to be superseded.

    Those who have been persuaded to think well of my design, require that it should fix our language, and put a stop to those alterations which time and chance have hitherto been suffered to make in it without opposition. With this consequence I will confess that I flattered myself for a while; but now begin to fear that I have indulged expectation which neither reason nor experience can justify. When we see men grow old and die at a certain time one after another, from century to century, we laugh at the elixir that promises to prolong life to a thousand years; and with equal justice may the lexicographer be derided, who being able to produce no example of a nation that has preserved their words and phrases from mutability, shall imagine that his dictionary can embalm his language, and secure it from corruption and decay, that it is in his power to change sublunary nature, or clear the world at once from folly, vanity, and affectation.

         —Preface to Samuel Johnson's dictionary (1755)

    Don't waste your time

    Being ill at home today I actually sat through most of the thing. His analysis is about as coherent as a 90 day weather forecast for central China based on the digit sum of the air humidity measurements in a California basement.

    Before watching the video I

    Before watching the video, I found your comment amusing and wondered how much you were exaggerating your opinion of it.

    Now that I've watched the video, and no offense meant to Uncle Bob Martin, whose previous work and credentials are likely at least 20 times mine, I find he's missing the point (at least the one I've personally put the most thought into).

    For another analogy, it felt to me as if he is looking at languages and language paradigms with a specific interest in diagnosing their diversity as having reached a point of stagnation (hence his belief that "all has been explored already")... but from the wrong symptoms.

    Put otherwise, it's as if, after taking a snapshot of human activity on earth and looking at the recurring patterns in how businesses are built, along with a closed set of currencies, you allowed yourself to deduce that all the different ways of making money have already been invented, and that you can now move on and devise the final equation that will make everyone happy, with no more discrepancies between rich and poor... but I shouldn't elaborate on what that would mean (politically).

    So, yes, if one restricts oneself to considering only the von Neumann computer and Turing's model of computation, "everything" has, in a sense, been said already, because the only remaining specifics depend upon the arbitrary number of indirections and layers you decide to put between languages and processors to compute things for solving problems.

    But, as dmbarbour pointed out precisely: this knowledge really doesn't matter at all in the end, as you still haven't addressed the issue of how to compose all these languages and tools (which are just the same one thing, looked at and sculpted from different angles, with different smoothness of curve in the general shape) and have them get along well with each other for given, specific domain problems.

    That's where I believe Uncle Bob Martin is wrong: I think it's useless to try to synthesize this into one last angle of view (for everybody to agree upon in the end). I seriously doubt that could ever happen, given the way people defend their tastes and preferences (and accumulated skills, their own personal "history") over this or that form, versus the common content (of problem-solving approach) they can share otherwise.

    I surely know, and agree with billions of other people about, how to find food and shelter to sustain and protect myself and my family (the problem is easy to solve; the only two mandatory sub-tasks are: make money, respect the law), but my brain and/or body imposes very strong limitations on the kinds of work I can commit to in order to make my income honestly without unreasonable effort! (Even 20 years ago, you couldn't have asked me to become a surgeon if I couldn't stand the sight of blood, however interesting the medical field may be.)

    like the table of elements?

    You seem to be saying this is more like the periodic table of elements, which was really the beginning of chemistry. In contrast, Bob Martin seems to be saying that we have all the pieces, so maybe we are nearing the end of language design.

    I guess the question is whether there is one way to put the pieces together that, after the fact, is obviously correct and the-way-to-do-it to almost everyone or if there are many different ways to combine language features resulting in fundamentally different (obviously not in the Turing sense) languages.

    I've watched the video

    I've watched the video attentively because I was curious to hear how he could have come to such an intuition. I know next to nothing about Clojure, though I've read good things about its specific design decisions many times. I sure wish I had more time to invest in catching up on languages that receive good reviews, of course.

    But even assuming Bob Martin is right to see Clojure as a nice synthesis PL, serving as a sort of witness of "the closure of all the useful paradigms" (just my own words/rephrasing of Bob's message here) -- though no doubt many people might find this debatable and prefer another language instead, btw -- what I meant is that I seriously doubt it's even relevant.

    Let's say one is able to demonstrate formally that the core set of useful programming paradigms to practice on top of today's computer technology happens to be the one already best synthesized through the Clojure looking glass (either as is, or as a basis for some future Clojure-influenced language yet to be defined, implemented, etc.).

    I know humans a bit, I am one of them:

    I would never bet a cent against the possibility that someone, very shortly thereafter, comes up with a better idea that eventually attracts attention and adds to what was believed to be the closed set of useful paradigms... but then it is not closed, and we're back to square one.

    I have no idea if Bob Martin is actually correct that today's Clojure is a sufficient foundation to back up his idea of attempting such a language design convergence effort.

    The only thing I know for sure is that my basic assumption is the exact opposite of his: TM-computation-model-based computer languages as I understand them today are, IMHO, doomed to diverge from each other in both syntax and semantics, completely disregarding their apparent, but very deceptive (IMHO, again), similarities.

    My own personal "research" interest is in the ways we could investigate to limit the "damage" done by language and tool designers' inventiveness (damaging when not well understood on the programmers' side, or not well documented on the tool vendors' side) with respect to interoperability constraints. The importance of those constraints is, from my own observation, ever increasing throughout the new use cases for putting computers into human activities and having them compute and communicate/transform "passive data" (e.g., final reports, screen renderings, etc.) through these PLs.

    barely scratched the surface

    We may (I suspect) have only barely scratched the surface of what can be done in a PL to support "abstraction", by which I mean in its most general possible sense the use of facilities within the language to modify the language. (A particularly old and ubiquitous form of "abstraction" support is the ability to define new procedures.) Although, for example, OO languages have vastly more abstractive power than, say, Pascal, I'm inclined to think on an absolute scale all mainstream languages are abstractively pretty feeble.

    If there can be such a thing as a last language, I think it would have to be one with effectively infinite abstractive power, so that it could be modified-from-within to usefully address any possible problem domain. I've no idea whether that's possible, but I see no evidence it isn't. Curiously, with the success of such a language, it seems the very concept of "programming languages" would fade into the background, which is the usual fate of old technologies when they pass through the curtain of a technological singularity.

    Composition and Integration before Abstraction

    The PL design space is vast, subtle, and largely unexplored, we agree.

    I believe there are a number of properties we should prioritize above abstraction.

    For example, if we ever want to combine solutions from different problem domains (and we very often do!) then it is necessary that independently developed abstractions can coherently work together. This need for integration should constrain our abstractive power.

    With composition, we have uniform operators for constructing programs, and the ability to inductively reason about useful properties of the resulting program. This is the basis for local reasoning, since we don't need to peek inside the operands to reason about the composition. This is the basis for scalable development, since we can build large programs without keeping everything in our heads and validate independent subprograms. This is the basis for open systems development since we can reason about how our programs behave when integrating hidden elements based on a few known high-level properties. Composition is simply more valuable than abstraction.
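
    As a toy illustration of inductive reasoning over composition: computing the worst-case latency of a pipeline from its parts, without peeking inside them (a minimal sketch; Pipe and latency are illustrative names):

        -- uniform operators for constructing programs
        data Pipe = Prim Double      -- a primitive stage with known latency (ms)
                  | Seq Pipe Pipe    -- sequential composition
                  | Par Pipe Pipe    -- parallel composition

        -- a property of the whole, derived inductively from the parts
        latency :: Pipe -> Double
        latency (Prim d)  = d
        latency (Seq a b) = latency a + latency b        -- sum in sequence
        latency (Par a b) = latency a `max` latency b    -- max in parallel

    Uniform operators (Seq, Par) plus a compositional property (latency) are what make local reasoning scale.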

    I think that the concerns for composition and integration can, and should, constrain our pursuit of abstractive power. I don't object to user-defined syntax, but I believe it must be tamed, perhaps with locality to a module.

    That said, we can get plenty of abstractive power without sacrificing composition and integration. A few techniques include generalized arrows, dependent typing, constructive logic, term rewrite systems, and generative grammars.

    We don't need infinite abstractive power. Just enough for the problems we'll actually encounter, constrained by the need to integrate solutions and reason about their composition.

    Makes sense to me. But don't

    Makes sense to me. But don't you think that composition and integration are "simply", in essence, just yet another form of abstraction lever for languages and for the processors/interpreters that give meaning to their phrases through computation results?

    My belief is that your work and ideas (and others') mostly acknowledge that this very flavor of abstraction has been neglected for too long, and that it's time to address it explicitly, as an integral part of the design process for languages and tools, instead of implicitly or in ad hoc fashion (or even worse, not at all, when downright ignored).

    I tend to agree that

    I tend to agree that composition should be usefully viewable as a form of [abstraction] — though I freely admit I haven't yet attempted to work out how one might do that within the framework of my mathematical theory of abstraction. (It'd be beautiful if one could also usefully view abstraction as composition; no wonder mathematics is often compared to poetry...)

    [edit: composition -> abstraction (!)]

    Composition vs. Abstraction (Terminology)

    Composition is not a form of abstraction. With composition, you have 'elements' that you are combining. With abstraction, you don't have an element yet, just a new way to later create one.

    Consider function composition. We compose functions to create new functions. Be careful to avoid assumptions: I have not said whether the language supports first-class functions, nor have I indicated whether the composition is 'point-free'. We can build some useful programs just by composing functions that come with the language. But that would be inconvenient. We want the ability to take a function and give it a name so that we may use it elsewhere in the program without massive copy-paste efforts.

    Hopefully, that should clarify the difference. We often think of 'functions' as being abstractions due to the connotations in day-to-day usage. But functions are not abstractions. The abstraction is when we name a function for later use. John explains this as a language transform - i.e. after abstraction, we have a new language that is almost exactly like the old one but has a new term in it. (More detail on his blog.)

    It is worth noting that certain semantics, together, can give us abstractions. For example, if we do have first-class functions and we can trap the input to a function in a variable (lambda-calculus style), then we can model 'let' statements. We could benefit from a little syntactic sugar, but this is primarily a semantic approach to abstraction: the syntactic sugar requires only a local transform and is tightly coupled to a semantic form in a 'host' language.
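
    As a minimal sketch of the distinction (illustrative names only): the composition is the expression f . f; the abstraction is the act of naming it; and, as noted, 'let' falls out of lambda:

        -- composition: combining elements we already have
        twice :: (a -> a) -> (a -> a)
        twice f = f . f    -- the composed function; naming it is the abstraction

        -- modeling 'let' with first-class functions:
        --   let x = e1 in e2   ===   (\x -> e2) e1
        example :: Int
        example = (\x -> x * x) (1 + 2)   -- i.e., let x = 1 + 2 in x * x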

    John Shutt would like to break the chains of semantics, e.g. using fexprs to reach under-the-hood to wrangle and mutate the vital organs of a previously meaningful subprogram. Such a technique is workable, but I believe it leads to monolithic, tightly coupled applications that are not harmonious with nature. I have just now found a portmanteau that properly conveys my opinion of the subject: frankenprogramming.

    Thank you for fixing my

    Thank you for fixing my possible terminology usage issue there. I might have confused the two notions for a long time. I'm not sure whether I've put enough thought into this to agree, even with myself, in a definitive way...

    The abstraction is when we name a function for later use. John explains this as a language transform - i.e. after abstraction, we have a new language that is almost exactly like the old one but has a new term in it.

    Now this, too, will definitely be very helpful to me, as it relates directly to an idea I had of a high-level algebraic basis for a language-/transform-interop (may-be-)infrastructure (granted, still in big need of the clarification I tried to express, when time allows...)

    And yes, I also have a lot to catch up with re: John's works on these matters. Thank you for the links.

    John Shutt would like to

    John Shutt would like to break the chains of semantics, e.g. using fexprs to reach under-the-hood to wrangle and mutate the vital organs of a previously meaningful subprogram.

    No, I wouldn't.

    There are two distinct issues here:

    • the general mathematical framework I've been using to study programming language facilities, and
    • the specific facilities I've been studying, notably fexprs.

    There may be some misunderstandings about the general mathematics, which could in turn muddle discussion of properties of fexprs (and may even have already done so).  So I'd better address the general stuff first.

    First thing to keep in mind:  the word abstraction, in association with the general mathematical framework, is at least to some extent a legacy term.  I devised this theoretical approach to study properties of programming languages because I wanted a "theory of abstraction", and I didn't think (and still don't) the most usual theoretical approaches were structurally capable of discerning the essence of abstractive power.  My preliminary explorations of the approach so far have all focused on abstractive power (which is how it relates to my blog post), so the only name I have for it is "abstraction theory".  But.  After I finally succeeded in casting this approach as rigorous math (techreport), I realized that although the word abstraction and its kin occur often in the informal discussion, their only use in the formal definitions is well downstream in the development, as a less specific shorthand for the key relation between languages with expressive structure ("B C-expresses A for observables O (or, B is as abstractive as A)"). 

    Second thing to keep in mind:  When I do talk about "abstractive power", I'm not talking about ability to violate encapsulation, such as reading and even writing "private" data.  That would be expressive power.  Abstractive power, as I've said, is about ability to modify the language from within so as to get to various other languages.  Modifying the language is different from modifying particular objects within a program.  For example, when you define a new module, you might want to make its internal state private, or make it public.  A language that forces you to make it private is abstractively weaker than a language that allows you, as the author of the module, to choose whether you want to make it private or not.  A language that forces you to make it public is also abstractively weaker than one that lets you choose.  The abstractive power is, essentially, the power to choose.  This is a subtle thing, because the choice wouldn't be meaningful if the privacy, once chosen, weren't binding on future programmers:  if you, the author of that module, choose to make its internal state private, there can't be any choice that could be made downstream that would undo that privacy; it wouldn't really be privacy if it could be undone.

    Lightning sketch:  A programming language is viewed as, essentially, an infinite-state machine, where the states don't have identity, so really all that matters is the sequences of labels on the transitions (the labels being terms over a CFG).  Because the labels aren't limited to source code, this view can model arbitrary meta-information ("behavior"); some subset of possible terms are designated observables.  You can view the overall shape of this infinite machine through the lens of an expressiveness relation between its states (such as Felleisen's expressiveness), and that's a language with expressive structure.  My abstractive power relation is a relation between the shapes of different languages-with-expressive-structure.
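
    A loose toy rendering of that sketch in types, just to fix intuitions (a gloss only, not the techreport's formal definitions):

        type Term = String               -- transition labels: terms over a CFG
        type Language = [Term] -> Bool   -- a language, identified with the set
                                         -- of label sequences it admits (the
                                         -- states have no identity of their own)
        type Observables = Term -> Bool  -- the designated observable terms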

    Although this approach is explicitly looking only at these vast networks of transitions between languages, never attending directly to fine substructures within a term such as, say, functions being composed with each other, nevertheless the rules of interaction between fine substructures should be expected to affect the overall shape of those vast networks.  So one should be able to derive formal results about the 'abstractive' consequences of fine-structure language features.  Which is (in part) why I suspect it may be possible to study composition using this approach.

    The fexpr-related issues raised by your last paragraph are sufficiently separate that they'd be well placed in a separate post, which is what I'll do with them.

    Modifying Language vs. Program

    Modifying the language is different from modifying particular objects within a program.

    I've found that, in an open system, the ideas get to be quite entangled. We cannot transform a reference, only wrap it. We can transform the language for a few particular objects in the system, but only those we are responsible for maintaining, and that is further constrained by compatibility and integration requirements.

    Conversely, adding services or plugins to a system might be considered to be a form of language extension and abstraction. If we have easy code distribution, we can feasibly 'abstract' new network protocols, overlays, and distributed frameworks.

    We can also understand language as a first-class object, especially in the context of staged programming models.

    can't be any choice that could be made downstream that would undo that privacy

    Above, you did not ask merely for "abstractive power". You asked for "infinite abstractive power". You will never see infinite abstractive power if other developers can choose to deny that freedom.

    Our ability to make choices will always be constrained in an open system. Compatibility, integration, reuse, et cetera will constrain us, both upstream and downstream. Developers have no need for a 'pretense' of abstractive power that they cannot effectively utilize in practice. Indeed, it is better if we remove an illusion of choice where there is none to be made.

    I think if you ever work out a 'bill of rights' for abstraction, to give developers freedom to abstract insofar as it does not abridge this right in other developers, you will end up favoring composition and integration above abstraction. We can achieve a lot of real, practical abstractive power if we keep our priorities in order.

    Well, of course we have

    Well, of course we have ... to some extent, at least.

    But the "extent" is all about how the question is phrased, indeed.

    Language space? I take it it's about PLs, right? (haven't had the time to watch the video yet)

    Then, if the word "programming" is used, I'd also take it "by default" that it's about the current "binary computer" technology space?

    Finally, then, yes, the last language does certainly exist: isn't it any language capable of encoding, in one syntactical form or another, the most general computation model (known as of today) that we aim to express with its phrases?

    So I'd rather conclude that it's useless to try to recognize "the last language" there: it has been known ever since Turing's work. It just so happens that this last language has many flavors and tastes in the way we write it, but it is essentially the same working concept shared and used by our brains:

    it's much like our 26-letter alphabet and English or French: my handwriting and use of English words is not especially better or worse than anybody else's, but what does make a difference sometimes (though of course very rarely(*)) is whether the ideas I try to convey in English vs., say, French require less effort from me in the former than in the latter, precisely because the English corpus of phrases is better prepared to receive them (e.g., with coined words, acronym jargon, etc.) in this or that specific instance of ideas (i.e., "computations").

    AFAIC, long story short, if one asks me: I'm much more interested in seeking "The Last Interoperability between (TC-)Languages" (which unavoidably come in a legion of forms, but are essentially the very same thing at work in machines) than in "The Last Language", which I regard as a rather trivial and unconstructive question, actually (in my answer's PoV above, that is).

    [Edit] (*) Yes, I did write "rarely" up there, which I think is consistent with my other belief that the "language interoperability" topic is much less of an issue for natural languages than it is for PLs, given my basic assumption that our brains' computational power is strictly greater than TMs'. (It then makes sense to me to seek some global interoperability improvement, constructively, between TC languages; whereas with natural languages our brain power can find mysterious ways to work around communication ambiguities... not unrelated, I suspect, to the fact that we have those five senses, btw.)

    We've found the last language.

    We know, with a reasonable level of certainty, that it exists somewhere and sometime in our universe. Maybe our galaxy, even.

    Embrace the Negative Paradigms!

    Bob Martin offers some excellent lines (at 27m22s):

    All the paradigms have been subtractive, not additive. All the paradigms we value so much have taken things away from us, not given us anything.

    Is there anything left to take away? goto, indirect goto, assignment, infinite size modules. Is there anything left? I hope not. If there's another paradigm, we may not have any code any more.

    I would happily offer many suggestions about what to take away: divergent functions, message passing and events, synchronous IO, global namespaces, global state, ambient authority, implicit time, arbitrary delay and recursion, explicit iteration and for-loops, exceptions, randomness, floating point equality.

    Recently, I've been working out how to take away even local state, i.e. eliminating 'new' calls from the runtime behavior of a program. My motivation is to resolve the state vs. upgrade problem, as it applies to live programming in eternal systems. Rather than modeling state as living inside the application, all state belongs to an abstract machine provided by the runtime then influenced by the application.

    But being rid of local state is only feasible because I already eliminate most need for state: caching, memoization, buffering, and queuing updates are easy to achieve in RDP without explicit state; there is no need for iterators, temporal semantics offer me synchronization without stateful semaphores or workflows, et cetera. Also, the above solution would be little different from global state if I had not also eliminated the global namespace and ambient authority.

    The lesson: eliminating cruft and complexity allows me to find even more 'features' to remove. Though, I shouldn't be surprised: I should have learned this much the first time I cleaned a garage.

    Today, we program with these massive Rube Goldberg languages. Many features seem essential because, if you remove any one of them, the machine stops working. But don't let that deter you! Subtract, and find a new idiom or solution for the pieces that fail. Subtract. Subtract, and simplify.

    If we ever reach a point where we do "not have any code any more", but we can still meet user requirements, we know we're done!

    (Aside: One might propose that the ultimate subtractive language is subleq. But then we need to make it easier to specify the data...)
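
    (For the curious: subleq is the classic one-instruction machine -- subtract and branch if the result is not positive. A tiny interpreter sketch, with memory as a list of Ints and halting on an out-of-range program counter; toy conventions, not any particular implementation:)

        -- instruction at pc is (a, b, c): mem[b] -= mem[a];
        -- if the result <= 0 then jump to c, else fall through to pc+3
        runSubleq :: [Int] -> Int -> [Int]
        runSubleq mem pc
          | pc < 0 || pc + 2 >= length mem = mem     -- halt
          | otherwise =
              let (a, b, c) = (mem !! pc, mem !! (pc + 1), mem !! (pc + 2))
                  v    = (mem !! b) - (mem !! a)
                  mem' = take b mem ++ [v] ++ drop (b + 1) mem
              in runSubleq mem' (if v <= 0 then c else pc + 3)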

    Rube Goldberg

    You've got it backwards I think.

    I don't believe current general purpose languages are like Rube Goldberg machines: they have enough features that you can encode what you want rather directly, even if it requires a lot of verbosity. On the other hand, when you use a stripped-down language (an end-user language, or the old Lotus Notes Script), then you really do start programming Rube Goldberg machines!

    My languages always tend to be in the latter space, where Rube Goldberg machines are required to do something, because the languages are not general purpose. So signals are great for continuous interactions, ah, but what about those discrete actions like pushing a button? Ugh. You can do it, but it's not pretty. The tricky part is designing a higher-level language that is somehow flexible enough to avoid Rube Goldberg machines.

    Step Back and See the Ugliness

    Rube Goldberg machines do not obviously appear to be Rube Goldberg machines when we are working on one small component of the machine. Individual components are simple and direct. We must stand back to see the complicated, fragile mess.

    I agree with your observations on 'stripped down' languages. Suppose we took a 'general purpose' toolkit for building Rube Goldberg machines. Then we strip it down, remove the 'dangerous' tools and materials - but don't replace them with anything. Now, our engineers need to be 'clever' if they wish to get work done - more so than before.

    The failures of those 'stripped down' languages should teach their own lesson: It is easy to aim for simple and achieve simplistic. Effective 'negative paradigm' design is not obvious.

    signals are great for continuous interactions, ah, but what about those discrete actions like pushing a button? Ugh. You can do it, but it's not pretty.

    It can be pretty to model a button with a signal. But pretty things might seem out of place if the setting isn't right. I might ask: should I attribute the ugliness to the button model? Or should I attribute the ugliness to the setting?

    Buttons on modern PS3 controllers are analog, e.g. press harder to jump higher. We use the duration of button behaviors in significant ways, e.g. hold a button to control camera zoom. We use multi-button behaviors: ctrl+alt+delete, music board synthesizers, mouse+keyboard actions. Buttons have physical state that varies over time: up or down.

    I am finding it very difficult, Sean, to attribute any ugliness to the modeling of buttons with signals.

    The tricky part is designing a higher-level language that is somehow flexible enough to avoid Rube Goldberg machines.

    Yes. We want powerful facilities for abstraction, and we want an effective set of compositional properties. But the real trick is to look at this problem and not immediately think 'tradeoff'. Language design is not a zero-sum game.

    It depends on what you do

    It depends on what you do with your signals. If you want something to pop up when you press a button, which is the common case, then you have to come up with a way of getting that thing to pop up. Yes, we could imagine the buttons are analog with continuous states, but that still does not help us get that popup on the screen, staying there once we've pushed the button!

    The real problem is that the world is messy and nice clean programming models can't deal with that messiness very well. They instead would rather reinvent the world to be as clean as they are, but the world can't be changed so easily.

    [edit] A general purpose language doesn't suffer from this problem (in general) since you can always express exactly what you want to do; you have complete control. A Rube Goldberg machine, by definition, is some sort of hack to deal with inflexibility in the language, perverting some behavior to do something else. Sort of like doing math in Kodu by creating and destroying robots and then having another robot "see" them to ascertain some sort of global condition (that you can't express directly in Kodu).

    Cancer of the Signal State Membrane

    It seems you have difficulty gathering signals into state. I vaguely recall similar troubles in Coding at the Speed of Touch, though your state model was not what interested me in reading so I felt no need to comment on it.

    The state model I currently use in RDP is based on accumulators:

    newAccum :: (Signal s, Signal s', SigTime s ~ SigTime s', HasLinkRef v a)
      => (st -> (SigTime s) -> (s x) -> (s' st)) -- accum function
      -> st -- initial state
      -> (a (Unit a) (s x))     -- provides signal to accumulate
      -> v (a (Unit a) (s' st)) -- provides state signal
    

    This is a fairly simple model with some nice properties. Foremost, it is compositional: the input is a signal source, and the output is a signal source. I also have a clear, monadic start-time and initial state for the accumulator, so I avoid the semantic concern common in FRP models with accumulating history. Since the accumulator controls its own state, there are no concurrency conflicts. External influence on state is indirect.

    The accumulation function itself might be an integral or something else. I allow multiple signal types to live in my model, i.e. both discrete and continuous signals can coexist in the same program. [To clarify, since you and I use 'discrete' differently: a discrete signal is one that is guaranteed to hold a constant value between updates, and to have a finite count of updates during any period of continuous time. You use 'discrete' primarily to describe events. I call events 'instantaneous'. It's a bit confusing, but I think we can learn one another's languages here.] A non-analog button would probably be modeled by a discrete signal.

    Now, to model the persistent popup problem (in no particular order, since I don't think sequentially):

    • Create an Accumulator. Dependencies: something to observe, accumulation function.
    • Create a demand-monitor to observe. A demand monitor is basically a mirror for signals: a set of signals go in, a signal of sets comes out. Except it's split into input and output halves, which avoids the whole feedback issue. I affectionately call the halves 'De' and 'Mon', as in (de,mon) <- newDemandMonitor
    • Accumulator observes Mon. De is handed to any agent who should have rights to influence the GUI state.
    • The accumulation function will gather high-level GUI control signals, such as 'raise popup (id, location, size, text)' and 'lower popup(id)'. When there is a confusing signal, such as demands to raise and lower occurring concurrently (i.e. from two different agents), the accumulator function will choose some arbitrary solution, perhaps favoring inertia.
    • We need a 'controller' agent to observe the button signal and turn it into a high-level GUI control signal on De. This agent must be granted a reference to De and to the appropriate button cap.
    • We need a 'view' agent that observes the accumulator's signal, filters and translates into a scene-graph signal, and pushes the scene-graph signal to the rendering element.

    In summary: The button's state signal indirectly causes a GUI control signal, which is promptly accumulated, influencing accumulator state, which is observed by a view agent, translated into a scene-graph signal, which is pushed upon the renderer. This all happens continuously, though the rendering element might sample the model discretely.

    The above sketch is a few modules shy of an implementation, but I don't foresee any trouble with it. There is a bit more to be done - e.g. I recently sketched some code for redrawing only dirty rectangles.
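
    For concreteness, a toy, non-RDP approximation of the accumulator idea in plain Haskell -- demand monitors and agents elided, and a 'signal' degraded to a list of samples:

        import qualified Data.Set as Set

        data Ctl = Raise Int | Lower Int    -- GUI control signals, by popup id

        stepGui :: Set.Set Int -> Ctl -> Set.Set Int
        stepGui open (Raise i) = Set.insert i open
        stepGui open (Lower i) = Set.delete i open

        -- accumulate control signals into the set of open popups over time
        guiState :: [Ctl] -> [Set.Set Int]
        guiState = scanl stepGui Set.empty

    The view agent is then just a function from each accumulated set to a scene-graph signal.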

    the world is messy and nice clean programming models can't deal with that messiness very well

    Our models of the world are messy. The world itself seems to be built on some pristine physics beyond our ken, unencumbered by human fallacy.

    We will face much ugliness and semantic conflict for interaction and interop between human models. So our languages should be effective at this. Interacting with diverse models in an open, federated system was among the founding requirements that got me started on language design in 2003.

    But I don't agree with your conclusion. We have no need to add to the mess. Garbage-in garbage-out is acceptable. Multiplying the garbage is not. First, do no harm. This means: predictable failure modes, graceful degradation, resilience, and continuous consistency management. I've already described one example of the latter: the accumulator will continuously decide what happens when there are conflicting signals for whether to raise a popup or kill it. This effectively dodges the issue of race-conditions, lost or out-of-order events, and so on.

    Even better if we could filter the mess and extract the useful signal. That, I leave to developers with specific domain knowledge.

    like doing math in Kodu by creating and destroying robots and then having another robot "seeing" them to ascertain some sort of global condition

    GlaDOS approves.

    [addendum] My mention of Rube Goldberg machines was artistic hyperbole, and perhaps that analogy has run its course. I would agree with a claim that developers need access to Turing-powerful computation facilities.

    Given an accumulator and real-time functions, for example, I can model Turing-complete calculations. The extra hoop for developers is rarely needed and very useful: the computation will be incremental, and it becomes easy to add explicit job control (pause, reset, etc.). In context of my interests, the job control benefits are not trivial - i.e. I'm interested in open distributed systems programming, where there is no single process to kill. Incremental computation allows us to more easily share a partial solution, which is useful in a variety of problem domains (UI, AI, control systems, etc.).

    I have not suggested we limit power, only that we control it. We do that by removing features. But we must sometimes replace them with a tamer alternative.

    First, YinYang doesn't have

    First, YinYang doesn't have the problem I described, which is why I like the behavior-based programming model so much. You just specify Pushed(Button), Open(Window) and you are done. Actually, it really should just be Button.Pushed = Open(Window), but I'm not there yet.

    Second, you have a powerful model, but it doesn't seem to be very accessible. Perhaps this is an example of the Haskell-problem: a language that requires such logical minds that only a handful of people on Earth can really take advantage of it and realize its beauty. This isn't the Rube Goldberg problem, to be sure, that is completely something else.

    Third, the universe might be based on a simple model of computation (to borrow from Wolfram), but the model has been running for billions of years. The state of the current universe is extremely complex; although we could apply some laws of physics to predict how this state will evolve, we have no chance (right now) of figuring out how the universe got into its state, and we don't know enough about this state to make accurate predictions. Instead, at best, we can react to what we know about the universe, and must forgo a perfect model of it. Related: Elephants don't play chess.

    Pushing Buttons

    I do not know the meanings of Button.Pushed or Open(Window), but I guess you mean here that 'Button.Pushed' is a signal and 'Open' describes a 'one-time' action from YinYang. In that case, you are transforming a signal into a stateful event. Event-based programming has its own problems, but not too bad for local tablet games.

    Earlier conceptions of my RDP model were similar: developers simply specified arbitrary conditions, and the agents would eventfully 'become' a new agent when those conditions are met - i.e. a reactive finite state machine. I eventually chose against this, because it hinders both anticipation and runtime upgrade. So I developed accumulators, which allow anticipation. I'm still working on the upgrade issue.

    My model is very simple, and I expect people will find it intuitive. The few people I've managed to sit down were able to grasp the concepts very quickly while working through some simple pseudo-code problems. Of course, even OO might seem inaccessible to a Smalltalk programmer if presented as a model in Haskell. I must eventually develop a reactive demand programming language.

    I'm not sure what you're objecting to on point 3. We don't need to know the state of the universe to do useful computation. We only need to know the states of our sensors, and that's a lot more accessible to us.

    YinYang isn't based on

    YinYang isn't based on signals (behaviors are quite different); SuperGlue was. If, say, an event maps one-to-one with an action, then there was no problem with signals; all the details can be hidden in the plumbing. If you needed to open that plumbing up, that is when you got into trouble with complexity. As a result, I went to great pains to avoid opening up the plumbing, hence the Rube Goldberg machines... Ah, whenever I think of Rube Goldberg machines, I can hear the Looney Tunes' Rube Goldberg theme* in my head.

    YinYang is incredibly stateful: every single tile that executes can add its own state to the executing object; the state is initialized on first execution, updated as the behavior is re-executed, and cleaned up when the tile is no longer in execution. I feel liberated since I no longer feel bad about state: it is very well modularized, and state was never really the problem, it was the poor modularization of state.

    On point 3, you were claiming that the world is somehow elegant, which implied to me that we could have very good models of it. But now you seem to be claiming otherwise; maybe I misunderstood your point.

    * Powerhouse by Raymond Scott

    Signals and YinYang

    I would say that YinYang has signals. A tile can observe state in its board. The value of that state changes over time. A signal is a value that changes over time. Therefore, observing the value of state over time implies a signal. But that is implicit. If you ever try to formalize YinYang, you would benefit from modeling signals more explicitly.
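
    (The usual denotational sketch, for reference -- FRP folklore, not YinYang's actual API:)

        type Time = Double
        type Signal a = Time -> a   -- a value that changes over time

        -- observing a tile's board state over time is then a Signal of it,
        -- e.g. boardState :: Signal BoardValue   (BoardValue is illustrative)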

    Even well-modularized state can cause problems. (1) State can become inconsistent with an observed resource (especially under conditions such as disruption, delay, temporal indeterminism). (2) State often makes it difficult to modify the system in a cross-cutting way (e.g. editing the code for a particular tile) without losing important work.

    The world is elegant, but that doesn't imply we can have very good models of it. There's a difference between creating a model and living in one. Since you've had your 'Elephant' reference for the day, see Blind men and an Elephant.

    Someday, I will make the

    Someday, I will make the argument that behaviors/tiles are fundamentally better than signals (note that Conal also calls his continuous signals "behaviors," but mine are different, more like Brooks'). It should be more obvious in the next draft of my paper (I'll post that on Monday or Tuesday), but I still have some more deep thinking to do. There are things you can do to protect your state, basically heavily encapsulating it, but you are correct that editing the code for a tile, at least one that is atomic (implemented in C# code), requires reloading the tile. But otherwise live programming should be a bit more robust in YinYang than SuperGlue when I finally get there.

    Nice parable. Perhaps one day we can tell the tale of the blind language designers, each with their own philosophy based on their experience, each right according to their own perception, but all missing the complete picture.

    Behaviors need Signals

    Behaviors describe continuous observation and influence over a pre-existing environment. Signals describe values over time. To observe an environment, one must observe a property of that environment. To continuously influence an environment, one must manipulate a property observed by some element in the environment. A signal is simply a value or property over time. Thus, behaviors imply signals both as input and output.

    Signals serve the same role in a behavior model as messages serve in an actors model. They are the basis for communication. Just as there are many roles for messages (command, query, reply), there are many roles for signals (control, live query, response).

    Claiming 'behaviors are better than signals' is absurd, analogous to a claim that 'actors are better than messages'.

    Reactive demand programming has behaviors. Indeed, agents in an RDP system are very analogous to tiles in YinYang. The main differences are in how they are structured: YinYang structures tiles in a hierarchical containment relationship, supporting a subsumption architecture. RDP agents are structured in accordance with the object capability model for secure service mashups, and duration coupling for resource control.

    SuperGlue was a mix of paradigms: OO, imperative, reactive. IIRC, conditional statements, state, and effects were only provided in one of the paradigms (imperative) so you were forced to 'open' signals to make useful decisions or do other interesting things. I don't believe that the signals were the problem, there.

    An explicit model of signals is very valuable for a behavior model, especially if you wish to control timing and consistency while allowing a high level of parallelism. You would need to 'open' the signals for certain FFI and legacy library integration. But the basic library of behaviors should mean that regular users never open signals.

    [edited to make it shorter]

    live programming should be a bit more robust in YinYang than SuperGlue

    Two observations I've made: it is difficult to manage a live program if many 'new' objects are being constructed at runtime (the program can get away from the developer, esp. if distributed), and we want the ability to transition state across changes in both client and server (e.g. keep documents, data, subscriptions). This has led me to resurrect techniques for 'externalizing' state rather than encapsulating it. External is not global - i.e. no need for ambient authority. It simply means that access is separate from life-cycle.

    If all state is external, then an application would never call 'new'. It would, instead, 'discover' state existing in an abstract machine or database, and manipulate that state. This allows us to model live programming as simply generating a new application that picks up where the last one left off. For changes that require a reformat, the IDE would raise a dialog asking developers for advice on how to transition the existing state (with a lot of common strategies available).
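
    A toy sketch of the idea (illustrative names, not an actual API): all state lives in a runtime-owned store keyed by stable names, so a new version of the application 'discovers' state instead of constructing it:

        import qualified Data.Map as Map

        type Store = Map.Map String Int   -- runtime-owned; outlives any one app

        -- the app names the state it expects and supplies a default; the
        -- runtime, not the application, controls the state's life-cycle
        withState :: String -> Int -> (Int -> Int) -> Store -> Store
        withState name def f store =
          let v = Map.findWithDefault def name store
          in Map.insert name (f v) store

    A live upgrade then swaps the code (the function f) while the Store persists, which is what lets the new program pick up where the last one left off.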

    That's my current vision of how to get 'robust' live programming. I think it will also help with orthogonal persistence. This does involve replacing the 'newAccum' I mentioned not very long ago as being how I currently model state.

    next draft of my paper

    I look forward to it.

    I look forward to it. Its

    I look forward to it.

    It's up.

    The lesson: eliminating

    The lesson: eliminating cruft and complexity allows me to find even more 'features' to remove. Though, I shouldn't be surprised: I should have learned this much the first time I cleaned a garage.

    Don't you think that if you find yourself needing to remove more and more features as time passes (e.g., compared to what you would have thought of removing in the language-design state of affairs surrounding you 30 years ago), it's precisely because:

    1. the number of languages, implementations, and interop use cases of the processing tools grew totally out of your own control;

    2. those bad boys in (1) also eventually decided, 15 years later, to go global and to be able to hop in and out of your by-that-time-not-even-floppy-disk-equipped box

    ?

    If so, my guess is that, much as in the analogy of your garage (and mine), the estimate and planning of the cleanup effort depend highly upon the number of folks having access to your garage, and upon their interest in the kinds of things they toss and store into it... ;-)

    Legacy and Compatibility

    I agree (if I'm reading your point correctly): it is very difficult to remove a feature from a language that is already in use, and any new language should interop effectively with existing systems.

    There are a lot of strategies for interop, though, and many ways to support 'legacy interaction' without compromising the language. It is a non-trivial issue, but I am very satisfied with my solution to it.

    Absurd reasoning

    His reasoning is just absurd. Language evolution is about finding better and more high-level abstractions. Picking a couple of random examples where that allowed getting rid of some low-level feature, and making a sweeping generalisation from there? I could use a similar line of argument to conclude that all language evolution ultimately is about writing more and more colons (you know, machine code: no colons; assembly: some; Pascal: some more; ML: double colons for cons; Haskell: omnipresent double colons for types; Scala: triple colons!).

    His story about typing is equally silly and completely self-contradictory, and he doesn't even notice.

    Better Abstractions

    His observation that progress is made by discovering liberating constraints is spot on. That is how we achieve those better abstractions - i.e. abstractions that we can reason about, compose, reuse, integrate, scale, and optimize.

    I am certain Bob Martin wasn't thinking of it in such positive terms. I call his lines 'excellent' because they speak of a greater philosophy than he seems to recognize. I especially like the last line I quoted: "If there's another paradigm, we may not have any code any more." If our next paradigm lets us meet general purpose requirements without any code, that would be more ideal than any paradigm I can imagine!

    With regards to his argument, I agree: Bob Martin's argument was classic fallacy, cherry-picking of evidence and examples to draw the conclusions he wants, combined with some ridiculous 'contagion' arguments (equivalent to: C is the last language because even Haskell uses '==' for equality, which is C syntax).

    you're both right

    To me, paradigms are mostly about eliminating or limiting a very general feature so that the source code better reflects the intention of the programmer. Obviously the feature must be replaced by something, or we must make clear how it is being limited, so we are also adding something; but that is the flip side of the same coin. Is there some paradigm you don't think that applies to?

    Rube Goldberg

    Wrong place

    There is no last language.

    Can we pick a last language? No. That's a stupid idea. Programming languages, like natural languages, are subject to change over time. The process is less organic and diffuse, of course; it's measured in terms of implementations used by many rather than in terms of idioms used by individuals. But the same principles apply, IMO. There can be no last language while technically creative people exist.

    There are entire syntactic paradigms we haven't explored or have barely explored -- essentially any language whose parse structure doesn't reduce to a tree (funges, DAGs, etc.) is so far outside the canon that we don't even have terms to express its syntax.

    There are semantic paradigms we barely recognize and have never coded in terms of (Petri Nets, for example, are a nice abstraction which could be the basis of theory for parallel processing and sequential allocation of finite resources). There are others we've coded in only a little bit, where an infinite variety of possible enhancements remain unexplored (For example Unification Languages where it's possible to restrict the "universe" considered when attacking a subproblem).
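    As a taste of how under-explored that semantic space is, here is a toy Petri net interpreter (an OCaml sketch; the net and place names are invented for illustration). A transition fires only when every input place holds a token, so sequential allocation of a finite resource falls out of the token game for free:

        type marking = (string * int) list                 (* place -> tokens *)
        type transition = { inputs : string list; outputs : string list }

        let tokens m p = try List.assoc p m with Not_found -> 0
        let adjust m p d = (p, tokens m p + d) :: List.remove_assoc p m

        let enabled m t = List.for_all (fun p -> tokens m p > 0) t.inputs

        let fire m t =
          if not (enabled m t) then None
          else
            let m = List.fold_left (fun m p -> adjust m p (-1)) m t.inputs in
            Some (List.fold_left (fun m p -> adjust m p 1) m t.outputs)

        (* Two jobs competing for one printer: the token game serializes them. *)
        let print_job = { inputs = ["job"; "printer"]; outputs = ["done"; "printer"] }
        let m0 = ["job", 2; "printer", 1]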

    Here's what I think we should be working on and, so far, aren't. When we build abstraction on top of abstraction on top of abstraction, and do things in terms of FFIs and translation layers between modules written in different languages (and we do this more and more often as our projects get larger and larger), we often wind up with a horribly inefficient implementation of functionality: written in terms of abstractions that aren't optimized for the specific case because they're written for a general case; or not optimized because they're written in a language that doesn't guarantee properties the specific program has anyway; or because translation code moves data between the different representations or runtime environments needed by code compiled from different languages.

    That, IMO, is what we need to fix. We need to find ways to optimize what's there -- treating the things a program does not do as sufficient license for optimizing the things it does do (optimizing the functional case when something like assignment *does not* happen, because the programmer doesn't do it, rather than only when we can prove it *cannot* happen, because the language forbids it), and leaving out representation switches when reorganizing code around the available representations can make them unnecessary.

    In short, we are doing composition, integration, and abstraction badly (in terms of code size, speed, etc.), and we need to figure out some serious theory and practice about how to do them well.

    Separation of Abstraction and Performance

    I think performance issues are caused more often by bad domain models or concurrency models than by inefficient implementations.

    However, I do share your position: performance-aware developers must feel free to use the abstraction and composition features of the language without justified paranoia that performance will suffer. Otherwise, performance-aware developers feel systematic pressure to create monolithic programs with a lot of hand-specialized code. And we really want to avoid systematic pressure to do the wrong thing.

    Eggs Ackley.

    I think your title absolutely nailed the idea I was going for; Separation of Abstraction from Performance. That's the grail.

    Just as a tremendously simplified example, consider someone implementing a stack of integers for some reason. He uses a library's "list" abstraction. It's implemented as a doubly linked list with individual nodes dynamically allocated, and a pointer to potentially variable-size data in each node.

    Because our programmer doesn't ever use the back links, the code to keep them updated ought not be generated. Also, the declaration of the back links in the datatype ought not cause memory to be allocated in the nodes.

    Because he's storing constant-size integers, the indirection to potentially variable-size data should not happen. Instead, the integer should be stored immediate, in the list node.

    And finally, if it can be proven (or if the code asserts after each insertion) that there are no more than 60 elements in the list (perhaps corresponding to minutes in an hour, seconds in a minute, degrees in a 1/6-of-a-circle arc, or some other feature of the problem domain), then instead of dynamically allocated individual nodes, the entire list ought to be allocated in a chunk, leaving out pointers altogether and replacing the "head" pointer with an integer indexing into the array.
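    Spelled out in code: below is the general abstraction the library provides, followed by what the stated usage facts (no back links, word-size elements, at most 60 of them) ought to let the system derive automatically. The sketch is OCaml, and the specialized version is written by hand here only to show the target:

        (* What the library offers: a doubly linked node with an extra
           indirection for potentially variable-size data. *)
        type 'a node = { mutable prev : 'a node option;
                         mutable next : 'a node option;
                         data : 'a }

        (* What the usage pattern justifies: one flat allocation, no links,
           the head pointer replaced by an integer index.
           (Bounds checks elided; the <= 60 invariant is assumed proven
           or asserted elsewhere.) *)
        type stack = { elems : int array; mutable top : int }

        let make () = { elems = Array.make 60 0; top = 0 }
        let push s x = s.elems.(s.top) <- x; s.top <- s.top + 1
        let pop s = s.top <- s.top - 1; s.elems.(s.top)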

    In short, programs should be seen as specifying desired results, or abstractions -- and the underlying system seen as free to find the most efficient method of achieving those results that it can, whether or not the method bears any resemblance to the algorithm outlined in the code.

    And right now that's what we're failing to do. When each feature of an abstraction adds its own cruft, the abstraction becomes "too expensive" in performance to use for a simpler task.

    Partial Evaluation

    Because our programmer doesn't ever use the back links, the code to keep them updated ought not be generated. Also, the declaration of the back links in the datatype ought not cause memory to be allocated in the nodes.

    Because he's storing constant-size integers, the indirection to potentially variable-size data should not happen. Instead, the integer should be stored immediate, in the list node.

    Apologies for being terse -- I'm in a bit of a hurry. It sounds like you're looking for serious progress on partial evaluation and/or staged compilation. I'm with you on that, but I wonder if it won't require serious uptake on dependent-types first (or perhaps it could go the other way).
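    For a flavor of what partial evaluation buys even without dependent types, here is the classic 'power' example, hand-staged in plain OCaml (in MetaOCaml the residual would be actual generated code rather than a closure):

        (* General version: tests the exponent at every call. *)
        let rec power n x =
          if n = 0 then 1 else x * power (n - 1) x

        (* Staged version: given n now, build a specialized function for
           later; the recursion on n is performed once, ahead of time. *)
        let rec power_gen n : int -> int =
          if n = 0 then fun _ -> 1
          else
            let rest = power_gen (n - 1) in
            fun x -> x * rest x

        let cube = power_gen 3   (* no test on n remains at call time *)
        let _ = cube 4           (* 64 *)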

    like natural language or like mathematics

    Any talk about a "last language" pretty much depends on programming languages being like mathematics and not natural languages.

    I'm not so sure about that.

    I'm not so sure about that. It's not like mathematical language is static. In fact, programming languages are (arguably) just a proper subset of all mathematically-derived languages. We got here from Frege's crazy logic diagrams via Principia Mathematica and Goedel numbers, eventually arriving at C++, Java, etc.

    Speaking of Goedel -- perhaps all of this talk of the "last language" mirrors the debate about the foundation of mathematics 100 years ago (and will meet with similar ends).

    Mathematical?

    It seems to me that calling the likes of C++ and Java "mathematically-derived" would make the cited mathematicians scream in agony. I have never seen anything remotely resembling a mathematical definition of them, especially not of the former. Nor do I think you could ever provide one that would have even the slightest chance of meeting mathematics' most basic standards.

    Formal is forever incomplete

    A counterpart of your remark is the opinion that a formally defined language would necessarily be "not enough" in some respect, breaking the idea of a last "complete" language.

    This is formally true for languages with consistent static type systems (in the extended sense of "everything that can be said about a program without running it"), but this may or may not be extended to other aspects of "language formalisms". For example, it seems harder to characterize how dynamic semantics may be "incomplete".

    I think we need formal definitions for our programming languages, and I postulate that the clarity of those formal definitions will always allow us to see that there is something missing, unexplained, or badly understood, and that we can go further (in the situations where it is necessary). So definitely no last "mathematically-inspired" language.

    incomplete or inconsistent

    only applies to a formal system that can perform the arithmetic required for the "Gödelization" used in the proof. Assume we have a formal system for defining other formal systems. Call it the meta-system, and the systems defined by it object-systems. The fact that an object-system can perform arithmetic, and is thus subject to the incompleteness theorem, does not imply that the meta-system is also subject to the incompleteness theorem.

    Arithmetic is a red herring;

    Arithmetic is a red herring; it's only used to encode the data structures necessary to formalize a host version of the system inside the system itself. Even if the meta-system can describe itself -- and it had better, or it's severely limited as a system-description tool and thus unsatisfying -- it cannot give strong guarantees about its description (consistency, termination, whatever). So you need another, more powerful system to reason about it. I don't think there is a workaround.

    The point of a formal system is not to "prove everything"; we know it can't. It is to be a good framework for working on the interesting things we want to prove. Trying to go "as far as possible" is not necessarily the most interesting thing; there are hard problems to be solved even in the systems of today. You only need to go as far as your current objects of study dictate.

    To take a concrete example, the "module system" question is still unsolved: most languages have a module system, but some, mostly in the ML family, try to formalize and statically verify its use. This is a very difficult problem, and the current "mainstream" solutions (SML and OCaml in particular) are not satisfying. There has been good research in this direction, but we still don't have a compelling picture that would make you say "ok, we'll put just that in the next ML/Haskell", because it's hard. And to study and work on that, you don't need an incredibly powerful type system; you can work at the F-omega level, possibly with singleton types (which are very, very restricted dependent types), but you probably don't need the powerful features of today's "big" type/formal systems with full dependent types, universes, etc. It's about precision, not power.
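    For readers who haven't met ML modules: the fragment below (plain OCaml) shows the part that is already well understood -- signatures, structures, and a functor checked against an interface alone. The still-open research questions (recursive modules, applicative vs. generative functors, type sharing across large programs) start just past this sketch:

        module type ORDERED = sig
          type t
          val compare : t -> t -> int
        end

        (* A functor: type-checked once, against the interface only. *)
        module MakeMax (O : ORDERED) = struct
          let max a b = if O.compare a b >= 0 then a else b
        end

        module IntOrd = struct
          type t = int
          let compare = compare
        end

        module IntMax = MakeMax (IntOrd)
        let _ = IntMax.max 3 7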

    self-definition is insufficient

    The property you want is not self-definition; it is self-reference: the ability of a formal system to make statements about itself from within itself. That is what is at the heart of the two incompleteness theorems. But a language and programs written in that language are generally considered distinct entities (for example, most people don't consider writing a program to be the same as extending the language). Hence the emphasis on axiomatic systems in the two incompleteness theorems, where what is being defined is defined within the language rather than external to it, thus providing the required reflectivity and enabling self-reference.

    mathematical language is now relatively static

    I'm not saying mathematical language didn't evolve; I'm saying it isn't going to evolve much further, especially the "core" mathematics used across virtually all mathematical and engineering disciplines. And the conformity is almost total at the semantic level, with only minor differences at the syntactic level. Obviously, out at the fringe of mathematics there are new ideas, new notations, and a lot less agreement, but focusing on that misses the point.

    We really need to distinguish between "programs as languages" and a "programming language".

    A modern programming language typically provides the ability to define new abstractions as well as a set of predefined abstractions. A program defines a language in terms of a set of abstractions and then says something in that language. For me the essence of a programming language is its ability to define new abstractions and not the predefined abstractions which just provide a common sub-language for programs. From this point of view a programming language is really a meta-language for defining other languages. This idea is not the least bit novel as people have been pointing it out for decades.

    So from my point of view if you are talking about a "last language" then this implies a focus on the meta-language role of programming languages. So a "last language" would need to answer the question of how we define abstractions and their relationships but need not say anything about what abstractions we should define.

    Ha! So well put

    Ha! So well put, in your last paragraph. I'd only add:

    and I fear that ain't gonna be easy, for it seems to me that very few of us even bother to mention these issues or to express strong enough concerns, besides people like you, or dmbarbour, etc.

    Advancing the notion of language.

    The pre-scientific evolution of human language was under no obvious pressure to select features oriented to formal languages, suitable for the investigation of abstract concepts in mathematics and computer science. To suggest that the language space in those fields has been more or less explored, also suggests that the conventional tree language is adequate as the main basis for developing formalisms and programming languages. I strongly question that view.

    An arboreal language is one whose syntax is tree based, in which most if not all branches of the syntax tree may be arbitrarily deep (resulting in trees with high structural variability), and in which there is no explicit notion of abstract memory. Natural languages, and those in Chomsky’s hierarchy are arboreal. There are at least four aspects of arboreal language, that render it problematic as a template for formal languages, and parallel programming languages in particular.

    Firstly, arboreal syntax trees can structurally only directly link one part of speech to at most one more complex part of speech, a feature I term the Single Parent Restriction (SPR). Arboreal trees cannot syntactically express the sharing of subparts of a part of speech, and hence cannot encode many-to-many relationships in a concise way. Either subexpression repetition is required, which discards a potential logarithmic compression in representation size, or a semantic notion of storage needs to be introduced. Graph-based languages bypass SPR, but are limited by requiring the solution of the NP-complete subgraph isomorphism problem for the application of a rewrite rule.
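    A tiny illustration of SPR (an OCaml sketch; the expression type is invented for illustration): a pure expression tree must spell a shared subterm out twice, and the standard escape hatch is to add a binder -- exactly the "semantic notion of storage" mentioned above:

        type expr =
          | Var of string
          | Add of expr * expr
          | Let of string * expr * expr   (* the escape hatch from SPR *)

        (* The shared subterm 'big' appears twice in the tree ... *)
        let as_tree big = Add (big, big)

        (* ... versus named once and shared, turning the tree into a DAG: *)
        let as_dag big = Let ("t", big, Add (Var "t", Var "t"))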

    Secondly, arboreal language historically has a serial character, in that a transmission of the shortest string of linguistic entities capable of expressing a basic relationship can only describe a single relationship between objects. Even if extensions are added to a tree language in order to describe parallelism at the syntactic level, as is the case with most parallel programming languages, SPR complicates the expression of shared structures in parallel processes. One factor obscuring the limitations of arboreal languages until recently has been that there was never a need to develop cues and mechanisms for many basic sentences/relationships to be semantically processed at the same time. A language system designed from scratch to be transmitted and processed non-serially is more likely to be able to support a coherent form of simultaneous semantic processing.

    Thirdly, arboreal trees exhibit a high degree of variability, where any individual branch may be arbitrarily long, requiring a complex parsing phase before semantic processing. Within the context of parallelism, it is also difficult to access and process multiple parts of an irregular syntax or semantic tree at the same time.

    The fourth factor relates to parallel computation. In sequential environments, it is the sequence of operations and data transfers that is principal in defining a sequential algorithm. In parallel computing, the placement of operations and data transfers in the machine environment is also important. Arboreal trees alone do not express, at the syntactic level, spatial information indicating data transfers and resource allocation within an abstract machine environment.

    To argue that these factors can be dealt with by plumbing code or "under the hood" dynamic semantics is questionable, because of the well-attested issues that have been encountered in trying to make arboreal parallel languages work well. A spatial approach to language and computation, which bypasses the above factors, was discussed here. I think the approach has the potential to support Robb Nebbe's list of features mentioned above (with the exception of the stipulation against gotos), and is also close to the metal, as suggested by dmbarbour.

    Programming languages syntax is not tree-form because of binding

    Firstly, arboreal syntax trees can structurally only directly link one part of speech to at most one more complex part of speech, a feature I term the Single Parent Restriction (SPR). Arboreal trees cannot syntactically express the sharing of subparts of a part of speech, and hence cannot encode many-to-many relationships in a concise way. Either subexpression repetition is required, which discards a potential logarithmic compression in representation size, or a semantic notion of storage needs to be introduced.

    Programming languages are not "arboreal" according to your definition, because of name bindings. When you manipulate a program, you have to take the binding structure into account, and this is one of the reasons why most programming languages -- say, the lambda-calculus with names represented as strings -- cannot be described by a CFG alone.
    The binding structure is richer than that, and you cannot guarantee the processing, transformation, and production of only well-formed programs if you're equipped with bare CFG-manipulation tools only.
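    A minimal demonstration (an OCaml sketch): the grammar below accepts every term, including ill-scoped ones, and the extra well-formedness judgment is precisely what no CFG can express:

        type term =
          | Var of string
          | Lam of string * term
          | App of term * term

        (* Scope checking: a property of the binding structure, not the grammar. *)
        let rec well_scoped env = function
          | Var x -> List.mem x env
          | Lam (x, b) -> well_scoped (x :: env) b
          | App (f, a) -> well_scoped env f && well_scoped env a

        let _ = well_scoped [] (Lam ("x", Var "x"))   (* true *)
        let _ = well_scoped [] (Var "y")              (* false: grammatical, ill-scoped *)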

    Arboreal is a syntactic feature only

    Arboreal-ness refers to syntax only. The lambda calculus is defined by a CFG. Name bindings exist on the semantic level, surely.

    ... and Muhammed is his prophet

    I recommend reading The Search for the Perfect Language by Umberto Eco, which is, among other things, about the centuries-long search for the language of Adam, which was neither "natural" nor "mathematical" but surely well designed.

    Today the "last language" is usually a topic for programming newbies, idealists, and kooks. That doesn't mean, of course, that no one shares the ennui Bob Martin expresses. Few people expect another "Cambrian explosion" in language design, or that after a period of purification and asceticism we will be reborn in an incestuous paradise where everything works with everything else.

    BTW, I wonder whether the ambition of singling out one language as the "last language" is even politically correct? Users of languages other than the last one would inevitably be discriminated against. What about diversity? Isn't that our ultimate value?

    Black Swans

    | Few people expect another "Cambrian explosion" in language design

    The idea of a last language is indeed a bit silly. But trying to make language environments less imperfect in a general sense is valid, because it could produce a Black Swan moment, leading to a Cambrian explosion in language design.

    I do expect a new Cambrian explosion

    If you go to POPL, say, and ask around, you will find a significant number of people there who are of the opinion that if you have Coq and a macro assembler, then you don't need programming languages any more. This is not a majority opinion, but it's a pretty big minority.

    I don't entirely agree with this position, since I think languages arise from fundamental semantic structures. But I do think it makes a lot of sense to explore building languages on the Coq+MASM substrate rather than with "compilers" or "interpreters". That is, we maintain invariants by stating them and proving we maintain them, rather than by elaborately circumscribing our language so that it is impossible to say things that break invariants. In categorical terms, the idea is to take a semantic category, and work externally with the morphisms of the category, rather than internally in the internal language of the category.

    Of course, any such fundamental change to the infrastructure of language design means that a lot of things we took to be fixed truths aren't, and that could shake up language design quite radically.

    More

    Could you elaborate a bit more here? I find that very interesting. The way I understand your post, it makes me think of Adam Chlipala's work on Bedrock. Do you have other references?

    I'm also not sure what you're describing in your "categorical" remark ("working externally with the morphism ..."). Are you speaking of a "low-level" category of meaning and describing denotational semantics as a translation/compilation process, or thinking of something else?

    Checking my understanding

    So this big minority is saying that you basically prove your program in Coq, and then spit out custom assembler code for it?

    MASM?

    Serious question: why do you need the assembler?

    You don't need it

    I just said it that way because I thought it sounded punchier. :) For real no-fooling all-the-way-down verification, you really do want to verify the actual machine code. (E.g., see Magnus O. Myreen's PhD thesis.)

    That is, we maintain

    That is, we maintain invariants by stating them and proving we maintain them, rather than by elaborately circumscribing our language so that it is impossible to say things that break invariants.

    Sorry, I can't parse this sentence. Do you mean that we will build the type system first, prove it, and then go on designing the language, instead of bolting it onto an untyped language?

    The converse

    It's not exactly the converse of what you say, because it cannot really be described as starting from an "untyped language".

    A common trend among type-system lovers is to design type systems of increasing complexity that capture finer aspects of the language than the classic type systems do. For example, one would design a type system such that only polynomial-time programs are well-typed, or such that only privacy-secure programs are well-typed. Once you have such a system, your language is limited to "good programs" -- but the drama is that some good programs must be left out.

    What neelk describes is the idea that instead of bolting those elaborate analyses into the type system, you capture the invariants they guarantee (the "do not go wrong" property) and prove that they hold for each fragment of a program in an ambient, less-typed -- but not necessarily untyped -- language, possibly using analyses similar to the previous type system (e.g. witnessing that the fragment admits a type in that system), or hand-crafted proofs. So you write programs that do not necessarily respect those elaborate well-behavedness conditions up front, and prove them afterwards.

    With the advent of proof systems and proof automation, this approach of proving good properties afterwards becomes tractable. One nice thing about this idea is that you can hope to compose different type systems for different parts of the program -- if their invariants are indeed composable.
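    A miniature of "prove it afterwards", sketched in Lean 4 (the function and theorem names are invented for illustration): the program is written with a plain type, and the property a fancier type system would have enforced is established as a separate theorem in the ambient logic:

        -- The program, written with no elaborate type discipline.
        def double (n : Nat) : Nat := 2 * n

        -- The invariant, proved externally rather than carried by the type.
        theorem double_even (n : Nat) : ∃ k, double n = 2 * k :=
          ⟨n, rfl⟩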

    I suppose the counterpart of this vision are the people promoting that type systems can be more than a helpfully restrictive framework, and also help you write typed programs through automation and all that. See for example Conor McBride's position: "types might have an active role to play, structuring the process of program inference".

    No?

    To my read, he means almost the opposite of this, but I may be writing between the lines, since this is basically the approach I'm taking with my own language. At a high level the language consists of:

    - A powerful underlying logic (e.g. Coq)
    - Axioms describing machine features
    - A customizable "skin" specifying both syntax and what the syntax means in terms of the underlying logic

    I think this third bullet is important -- no one actually wants to write code in Coq. And BTW, I think some language of this form will eventually become the "last language," in the sense of underlying the new languages that people build in the future.

    One nice thing about such a system is that you can 'bolt on' new type systems fairly easily. Type systems can be proven sound (they propagate true propositions), but they needn't be complete in any way. Even if there isn't a typing rule to get you from the type you have to the type you want, you can always drop down and prove the correctness of your cast in the ambient logic, which defines what the types mean. Thus, instead of working only with "morphisms in the internal language of a category" through type-checked terms, we allow ourselves to build morphisms externally.

    "build morphisms externally"

    Could you detail more precisely what you and neelk mean by "externally" here? The way I understand it, it means defining "low-level" mathematical objects and afterwards proving that they are indeed legitimate morphisms in the "high-level" category. As in: you define a function on natural numbers, and afterwards prove that it is a monoid morphism from (N,+) to whatever. Is that what you mean, or something completely different?

    Something confusing about this terminology is that, in category theory, if I understand correctly (I'm no expert), "externally" is often used to mean "observationally" (defined in terms of its interactions with other objects/morphisms), which is completely different and perhaps even contradictory.

    ""build morphisms externally""

    Yes, that probably should have been in quotes in my post as well, since I was parroting Neel. My category-fu is weak, but my understanding of what he is saying is that you have a semantic category that defines the meaning of the subject category's objects and morphisms, and then, rather than using the natural type system for the subject category, you prove in the semantic category something to the effect of "there exists this morphism in the subject category such that ...".

    So your example of proving a function to be a monoid morphism, rather than building it up from primitive monoid operations, is probably a good example of what I'm talking about, and I think it's also what Neel is talking about. I can't comment on your second point.

    Yes, you understand what I mean

    This is exactly what I mean. "Low-level" in this sense might be literally low-level, such as inline assembly.

    The idea is that the really important things are the invariants that the runtime system and the rest of the program expect, and it doesn't really matter how the binary code that respects them got generated. Maybe they were written in some typesafe language and compiled, or maybe somebody just wrote some assembly and did a correctness proof. If the invariants are maintained, there's no reason to care.
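    To close the loop on gasche's arithmetic example, here is what "building the morphism externally" might look like in Lean 4 (names invented; assumes a recent Lean with the omega tactic): first the raw function on naturals, then, after the fact, the proof that it is a monoid morphism from (N, +):

        def f (n : Nat) : Nat := 3 * n

        -- Proved externally: f preserves the monoid structure of (Nat, +).
        theorem f_zero : f 0 = 0 := rfl
        theorem f_add (a b : Nat) : f (a + b) = f a + f b := by
          unfold f; omega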