Why is there no widely accepted progress for 50 years?

From machine code to assembly, and from there to APL, LISP, Algol, Prolog, SQL, etc., there was a pretty big jump in productivity, and almost no one uses machine code or assembly anymore. However, it appears there's been almost no progress in languages for 50 years.

Why is that? Are these the best possible languages? What is stopping the creation of languages that are regarded by everyone as a definite improvement over the current ones?


re: No Definite Improvement

The essential issue: machine code, such as x86, is such an AWFUL programming language (from the perspective of a human society) that it's TRIVIAL to improve on it in every imaginable way - safety, reusability, reentrancy and concurrency, consistency, composability, modularity, portability, debuggability, testability, optimizability, extensibility, etc. For a broader list of potential dimensions for improvement, see the "list of system attributes" and "cognitive dimensions of notation" on Wikipedia.

However, PL designers are usually pretty smart and not too rushed, and those from today aren't any smarter than those from fifty years ago. At best, we have a different perspective and different priorities (e.g. due to memory access becoming more expensive, concurrency becoming more widespread, access to FPGAs and GPGPUs and cloud computing networks, introduction of Open Source package distribution models vs. hide-everything-to-protect-IP, etc.). So, after a PL is designed, developed, and has matured a little through actual use (at which time we can assume there are no obvious incremental improvements), it becomes a lot harder to improve on it in EVERY way. Instead, different PL designs constitute different tradeoffs between system properties, or different visions of the system, or just the different aesthetics of PL designers.

Which isn't to say that finding a better-all-around language is impossible. But it will likely require a revolutionary vision of 'the programmable system' (e.g. the sensors, actuators, storage, and networks) and how programmers/users should extend and interact with it. The idea of PL and UI being separate things, or of applications and services being walled gardens, for example, are not essential.

I've got a little list...

Interesting. If I'd tried to make a list of imaginable ways to improve a programming language, readability would likely be at the top of my list; if, indeed, I went on to attempt a lengthy list rather than, say, just remarking on the desirability of readability. Thought-provoking, that readability wasn't on your list (sic: neither good nor bad; thought-provoking). To my mind, the purpose of a programming language is to express what we want done, hence lucidity is a core value.

re Readability

Readability is not a property that I would attribute to a language, because it's as much a function of the reader (familiarity with language, libraries, patterns, problem domain, etc.) and tooling (syntax highlighting, search, history with semantic diff) as of notation and semantics.

But those cognitive dimensions of notation do contribute to readability. I mentioned one of them, consistency, and then the set as a whole.

Aside: There is also much that tooling can do to enhance a human's ability to explore and understand a codebase, e.g. good fonts, syntax highlighting, jump-to-definition, progressive disclosure, in-place tutorials or spreadsheet-like unit-tests. How much should be under the umbrella of 'readability'?

Exactly - The Bus Test

We can conceive of "Programming Languages" many ways. Many think of PLs as "just a tool to get the hardware to do something." But others conceive of hardware as "just a disposable tool to make programs go."

In my own focus on team engineering, I've frequently taught about writing code that "Passes the Bus Test." Ask of any library, function or unit: When any team member gets hit by a bus, is their code written so that the rest of team keeps proceeding as smoothly as possible?

In that context, I've fired many "over clever" programmers. Dysfunctional personalities are everywhere, and so too in development teams. Some are "tricksters" and "class clowns." Others are technique fetishists with chips on their shoulders. Yet others imagine "gold stars" coming their way for needless cleverness. Still others "minimize their keystrokes" at their desks. On my teams, none of that was acceptable, and I made all those pathological cases clear.

These problems are not unique to PL use. In Systems Analysis, we learn how some employees guard their jobs as "the only ones who know how to find anything in their filing cabinets." Those are liabilities to every organization, but they are very real. Better Systems Analysts are good at finding and replacing them.

In some linguistic disciplines, we share reasonable standards for "how to write readably." That often includes documentation standards. But it's nothing we can assume.

In sum, 85% of the time, individuals don't matter. Teams hit their targets or they don't, as individuals come and go.

So back to PL, semantic symmetry is the goal. With that goal, NO "contests" in how the team PL is used are ever acceptable within a team. Unfamiliar but useful techniques must instead be teachable and taught, thus "passing the bus test." Some of this should be familiar from "Egoless Programming" and "Pair Programming" doctrines.

Writers write for readers. A children's book or cookbook is not "made better" by writing it in the style of Pynchon's Gravity's Rainbow.

Why is there no widely accepted progress for 50 years?

Because we're clueless.

Don't ask why until you answer whether

There's been plenty of progress on many topics related to PL evolution. It's quite silly to suggest that nothing has progressed in 50 years.

That said, achieving noticeable gains becomes harder in each generation, because the bar of entry gets raised fairly dramatically over time. It used to be that you could create a breakthrough language with amazing whiz-bang features, and the entire compiler and/or runtime would be a few thousand lines of code -- small enough that one person could hold the entire thing in their head.

No longer. Such a language today would be considered a toy, even by the low standards of a college student working on a term project.

Toy languages

Just taking one point there that I found thought-provoking: I suggest the perception that anything simple must be a toy may be a consequence of the paradigm we're imposing on the subject.

This aspect of scientific paradigms can be tricky, so pardon if I repeat familiar territory (from Kuhn's The Structure of Scientific Revolutions). Paradigms at best are immensely valuable because researchers within the paradigm can apply laser focus to exploring the research space within the paradigm, spending absolutely zero resources worrying about alternatives to the paradigm. A tremendous advantage at best, this inevitably suppresses alternatives to the paradigm even if they have merit. Iirc (can't instantly conjure it up), Kuhn remarked in the book that at the time a paradigm is adopted there is already ample evidence that it's wrong. How useful is it to focus research this way? Seemingly, as long as it is advantageous to continue to explore the space of research within the paradigm, the paradigm should continue to be useful. One of the things that can go wrong with this pattern is that a paradigm may continue to look plausible from the inside, while standing back from it one may suspect that really we might do well to look for a better approach.

At the level where we perceive anything simple as a toy, it seems we may really have just one paradigm for programming languages, despite the many so-called "programming language paradigms". If we really need to break out of this paradigm (or mindset, whatever you call it), the thing we're looking for may indeed be extremely simple. (Recalling, notionally, (profundity index of idea) = (difficulty of discovery) times (difficulty of finding a really good explanation) times (obviousness once explained well).)

Finally, a chance to use the word profundity

I do believe you are onto something, and yes, I think that is what I was getting at. In some ways, our industry has repeatedly converted the profound to the obvious, and as we have accreted those advances, the cost of incremental profundity has grown exponentially.

What we've learned

In about the mid-1990s, I tried to make a list of ideas we'd come up with about how to use computers that seemed likely to still be perceived as important ideas in another century or two. I came up with two items:

  • The idea of an operating system.
  • The distinction between control and data.

Operating systems aren't really my thing; they did strike me as pretty fundamental, though, so I listed them.

Some folks like to try to collapse the control/data distinction, but I'm not convinced that helps. The distinction between action and participant recurs wherever you look: control and data, verb and noun, energy and matter. Closely related also to time and space. Yes, there are advanced theories that tamper with each of those distinctions, but in practice they're all useful distinctions.

I never did come up with a third thing for my list, though I struggled with it, and have revisited it from time to time in the years since. What else dominates the whole field? I share the sense that we're not making progress; if we're not, something that dominates the field might be holding it back. The other thing that occurs to me —and that I've never been sure enough of to put it on the list— is types.

I might add Composition and Environment

I might add:

  • Composition: hard to see how composition would ever go away. Build larger components by composing smaller components.
  • Environment/context: Every computation runs in an environment, and many recent advances have been about moving the environment closer to a first-class value, i.e. scoped bindings in languages, hardware/software virtual machines, delimited continuations. (A small sketch of the scoped-bindings case follows below.)
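Since scoped bindings came up, here is a minimal sketch of what "environment as a value" looks like in practice. The toy evaluator, its expression encoding, and every name in it are invented for illustration; nothing here is from the thread.

```python
# A minimal sketch: instead of reading ambient globals, the evaluator threads
# an explicit environment that can be captured, extended, or swapped wholesale.

def evaluate(expr, env):
    kind = expr[0]
    if kind == "lit":                       # ("lit", 42)
        return expr[1]
    if kind == "var":                       # ("var", "x")
        return env[expr[1]]
    if kind == "let":                       # ("let", "x", value_expr, body_expr)
        _, name, value_expr, body = expr
        extended = {**env, name: evaluate(value_expr, env)}   # a scoped binding
        return evaluate(body, extended)
    raise ValueError(f"unknown form: {kind}")

base_env = {"x": 1}
program = ("let", "y", ("lit", 2), ("var", "y"))
print(evaluate(program, base_env))          # 2; base_env itself is untouched
```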

FCVs

Thought-provoking additions to the list. I think I agree about environment/context, which, though its roots may belong to logic, seems to have something of the stamp of programming upon it. Honestly unsure about composition, which might be deemed of an earlier vintage, so not exactly a programming notion. Tracing the origins of foundational ideas can be tricky. E.g., our modern notions of function and variable go back, afaik, to Frege in the latter half of the nineteenth century and belong to logic rather than computation, though admittedly refined by Church who is more of a transitional figure (a former student of his later described him as "logic incarnate"). Now that you've directed attention to these sorts of ideas, another with clear PLT origins that I've lately come to suspect may be crucial far more widely (in mathematics both pure and applied) is first-class value.

Ah, you're looking for

Ah, you're looking for concepts that originated from computer science. First-class value is definitely a good candidate then. I think computing merely refined our understanding of the other concepts I mentioned. Other ideas:

  • Computational power: the computational power of a system being designed or used as a blackbox can be a meaningful property, i.e. whether it's domain-specific/purpose-built or general purpose/Turing complete.
  • Encapsulation: some distinction between inside and outside views of an entity seems inescapable, although there is some relationship with environment/context here. Every computer clearly has an internal view that we can only interact with via an external facade.
  • Computation's close connection with logic, inference and synthesis: we have type inference and program synthesis (extracting OCaml programs from Coq proofs), and so on. I expect inference/synthesis to become increasingly important over the next century.
  • Induction and Coinduction: induction has received a lot of attention and originated outside of CS, but coinduction is only just starting to receive attention and arguably came from CS. Having an abstraction and its dual seems pretty important for expressiveness. (See the small stream sketch below.)
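As a small illustration of the coinduction point, Python generators make a serviceable stand-in for coinductive streams: the data is defined by how it unfolds under observation rather than by a finite construction. The `unfold` helper and the example are mine, not anything proposed above.

```python
# Where induction builds finite data bottom-up, coinduction defines possibly
# infinite data by how it is observed/unfolded. Generators approximate this.
from itertools import islice

def unfold(seed, step):
    """Corecursively produce a stream: step(seed) -> (value, next_seed)."""
    while True:
        value, seed = step(seed)
        yield value

naturals = unfold(0, lambda n: (n, n + 1))   # an "infinite" stream of naturals
print(list(islice(naturals, 5)))             # [0, 1, 2, 3, 4]
```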

Maybe Device Drivers?

Perhaps you'd include the 1-to-N API-to-device-driver model in your OS item, but it was a major leap forward. Microsoft's relegating printer drivers to the OS behind a printing API put several pre-Windows word processing companies out of business. Writing code to custom hardware remains important in some applications and subsystems, but when it's not important, it no longer needs to be important.

Perhaps we can credit Liskov et al. in PL theory with these advances?

It isn't what you don't know ...

If it was easy or obvious I expect we would have seen more progress. I would guess that it isn't what we don't know that is holding us back but what we think we know that isn't true.

Unknown unknowns

Seems likely. This also seems related to my predilection for exploring exotic hypotheses that challenge conventional wisdom, and perhaps to why my list of fundamental things we've learned is short.

Conjecture: A Language Stack?

In a PL conceived as a Language Stack, each lower layer of the Language Stack provides an operational semantics for the layer above.

We might cobble together prototypical examples from Asm -> C -> Utilities -> Shell Pipelines from *nix. In C compilers, we might see embedded asm {} constructs as examples of "dipping lower in the stack from a higher layer."

Or similarly, where interpreted languages access "lower on stack" C libraries via Shared Library/DLL FFI.
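To make the "dipping lower in the stack" case concrete, here is a small sketch of the FFI direction, using Python's ctypes to reach down into libc. It's only an illustration of the layering, not part of any proposal above, and assumes a system where `find_library("c")` resolves.

```python
# Dipping lower in the stack: an interpreted language calling a C library
# through a shared-library FFI.
import ctypes, ctypes.util

libc = ctypes.CDLL(ctypes.util.find_library("c"))   # the C layer beneath us
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"language stack"))               # 14, computed one layer down
```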

Such a Language Stack would not be a playground for explorations (also a good thing, but a different thing), but more like One Stop Shopping for writing libraries and applications at the appropriate levels of abstraction/performance.

Reflective tower

This puts me in mind of the notion of "reflective tower", introduced by Brian Smith's reflective 3-Lisp. If you're not familiar with it, amongst possible points to break into the subject are the 1988 paper "The Mystery of the Tower Revealed: A Nonreflective Description of the Reflective Tower" by Mitch Wand and Dan Friedman (current link), or Brian Smith's 1982 dissertation, "Procedural Reflection in Programming Languages" (current link).

(It's a bit disconcerting to me that, atm, not only does Wikipedia have nothing about 3-Lisp, but Brian Smith's Wikipedia page doesn't contain the string "Lisp".)

Exactly! With a few purpose-driven constraints

You read my mind. Racket, ST-80 and Forth were the primordial soup out of which my dreams conjure something more diamond-hardened and "cathedral like."

First, presume production application targets: payroll, inventory, telecommunications ("billing systems," as they call software here in Northern Virginia), accounting, actuarial systems, credit scoring, policy decision support, embedded control systems and so forth. We'll need "stack layers" crafted for production targets.

Second, presume life/death and/or fortune/loss are always at stake in our production targets, and thus at stake in the compiler technology and language design itself. I'll come back to this below.

Third, conclude that ZERO programs "written once for throwing away" are ever candidate application targets (or thus, design targets) for our Language(s) Stack. bash or guile or gawk or perl already suffice and need never go away.

Presume ASM for ARM, x64, PPC, etc. are at the "base" of the stack, in their resplendent particularities. We'll not have one language, but instead a "stack" of languages with clear operational semantics between them, very much on purpose.

We might envision a "bootstrapping meta-circularity" ONLY at some low stack Forth-like language also suitable for laundry machine control, anti-lock brakes, lunar landers and other embedded systems. That gets us to a C99 or Rust style language, then to a Typed-Racket style language, and so on "up the stack" to SQL and some modernized MUMPS for the medical industries.

One higher-stack goal is achievable parallelism, akin to language level pipelines or some other data-flow programming models, to use up some of our embarrassment of multi-core riches.

Returning to safety ethics of life/death and/or fortune/loss, I invite anyone to Google for years of research efforts lost (or worse) to sloppy Excel spreadsheets of data and/or "programs." That's not OK. Where that impacts millions of misbilled customers, exposed credit information or mistreated ill, it's even less OK.

To date, only limited-liability corporate legalisms and outdated social conventions regarding "researchers" excuse such sloppy, self-aggrandizing practices. I counsel strong typing, type-conversions over type-coercions and explicit numerical tower types (and constants), including fixed decimal types appropriate for financial applications.

Returning to safety ethics of applications (not language semantics), we should dream of human verified (thus verifiable) applications substituting for sloppy work hiding behind "trade secrets" and "best practices" pure fictions.

What would that mean? We should aim for tools and applications where judges and juries might credibly put executives and programmers behind bars for reckless endangerment of customers using computing tools. How is that for a goal?

We have far too many toys. We need tools. Then we can put our toymakers to accountable and productive labors, where they belong.

Ogres are like Onions

Sounds like you’re after Piumarta/Kay’s Maru/Nile/Gezira work:

http://www.vpri.org/pdf/tr2007008_steps.pdf
http://www.vpri.org/pdf/tr2011004_steps11.pdf

Gezira, a graphics library in <500 LOC, implemented in the Nile language, which is implemented in the Maru language. Layers on layers. Some links to explore via the orange site:

https://news.ycombinator.com/item?id=10535364

As an amateur crafter of end-user languages I have a fondness for Seymour Papert’s Logo, a Forth/Lisp hybrid that cultivates compositional over algorithmic thinking. A much deeper, more “growable” language than superficial appearance would suggest. My own WIP on a modern Logo-like is not at Nile’s level but might be marginally notable, if only because there isn’t more happening in this area. Everything’s pluggable, including syntax and behavior; a natural DSL toolkit.

Two of my favorite books on

Two of my favorite books on programming, SICP and Kernighan and Pike's The Practice of Programming, both advocate this problem-solving-is-language-design approach, going much further than the RoR-inspired "DSL moment" of the mid-Aughties, which focused on processing passive domain-specific data more than implementing full languages. These books embody the ethos of the yin and yang of computing, the MIT and New Jersey schools; I try to expose people new to the field of computing to both, not least of all because they converge from two very different sets of priors.

Inverted Language Stacks and Accelerators

Language stacks inevitably have abstraction leaks. The lower level language leaks into the higher level one, whether by explicit embedding or various design assumptions. This isn't necessarily a bad thing. It depends on the quality of the lower-level language.

Unfortunately, most lower-level languages upon which we've historically built language stacks (ASM, C, etc.), although good for performance, are problematic in many other aspects, such as reading, explaining, accessing, debugging, porting, composing, extending, adapting, sharing, securing, persisting, multi-processing, distributing, caching, optimizing, heterogeneous computing. At the end of the day, the tower of languages topples due to its poorly designed foundations.

An intriguing alternative is to invert the stack.

Start with a conventionally high-level language that has many nice properties except performance - composition, concurrency, reflection, automatic memory management, natural numbers with bignum arithmetic, caching, optimizing, etc.. Kahn Process Networks or Lafont Interaction Nets are viable starting points.

Model low-level abstract machines within the high-level language, albeit favoring Harvard architecture or other structural behavior-data separation. We could model an abstract register machine with binary memory and fixed-width integers and floating point. We could also model mesh networks, GPGPUs, FPGAs, etc. This doesn't need to be purely functional, depending on how the higher level manages effects.

Annotate subprograms based on these models for acceleration. The compiler can recognize the annotation, validate the subprogram, then replace it with the actual processor. The behavior-data separation simplifies this substitution, e.g. translation of abstract machine code to x86 might be performed up front.
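For what the programmer-facing side of such an annotation might look like, here is a hypothetical sketch. The `accelerate` decorator, the target name, and the registry are all invented for illustration; in the proposal above the substitution would be done by the compiler after validating the subprogram, not by a runtime lookup.

```python
# Hypothetical sketch: the subprogram is written against the modeled abstract
# machine (plain Python here, as the reference semantics), and a backend may be
# substituted for it. A registry lookup stands in for the compiler's work.

ACCELERATORS = {}   # target name -> native implementation, registered elsewhere

def accelerate(target):
    """Mark a subprogram as a candidate for replacement by a real backend."""
    def wrap(model_fn):
        def run(*args):
            native = ACCELERATORS.get(target)
            return native(*args) if native else model_fn(*args)  # fall back to the model
        return run
    return wrap

@accelerate(target="abstract-register-machine")
def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

print(dot([1, 2, 3], [4, 5, 6]))    # 32, via the model unless a backend is registered
```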

Such a language stack supports appropriate levels of abstraction/performance, but upon a more robust foundation.

Brilliant! Love it.

I feel like maybe a fog has lifted. My first self-critical afterthoughts are that my own mind has been clouded by my (not unique) admiration for ST-80 and FORTH designs based on towers built upon "primops."

I'll admit, I had to look up Kahn Process Networks :)

Inverting the Language Stack

Yes, absolutely this!

In order to do this, in order to invert the language stack, you need to separate the conceptual stack from the implementation stack, because with the latter we obviously have to have the low-level stuff at the bottom.

Smalltalk did a great job with this at the level of the types, the class hierarchy. Where in something like C the conceptual roots are the implementation roots (integers, arrays), in Smalltalk the conceptual roots are abstract: Object, Magnitude, maybe Number, Collection.

The implementation roots are very specific kinds of these abstract concepts, for example SmallInteger or Array, that do not correspond to the conceptual roots. In fact, they tend to be tucked away at leaf nodes of the class tree: SmallInteger (Object >> Magnitude >> Number >> Integer >> SmallInteger) and Array (Object >> Collection >> SequenceableCollection >> ArrayedCollection >> Array).

Note that the implementation roots tend to be optimized, "accelerated" versions of the concepts they represent. This can certainly be done at other places as well.
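A rough Python rendering of that observation, for concreteness: the conceptual roots are abstract, and the optimized implementation type sits at a leaf. The class names echo the Smalltalk hierarchy, but the code itself is just an illustration.

```python
# Conceptual roots are abstract; the optimized, "accelerated" implementation
# lives at a leaf of the hierarchy.
from abc import ABC, abstractmethod

class Magnitude(ABC):                  # conceptual root
    @abstractmethod
    def __lt__(self, other): ...

class Number(Magnitude, ABC):          # still conceptual
    @abstractmethod
    def __add__(self, other): ...

class SmallInteger(Number):            # implementation leaf, free to be optimized
    def __init__(self, value: int):
        self.value = value
    def __lt__(self, other):
        return self.value < other.value
    def __add__(self, other):
        return SmallInteger(self.value + other.value)

print((SmallInteger(2) + SmallInteger(3)).value)   # 5
```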

However, this is not the case for the metasystem, the actual "language stack", which isn't just not inverted, it's actually downright monomorphic in Smalltalk and pretty much every other language out there. CLOS is one of the few exceptions: it added a bit of polymorphism to the metasystem (see The Art of the Metaobject Protocol), but that bit is really not that large.

It always seemed to me that we should apply this Smalltalk-style inversion (and decoupling) of the type-system to the language itself, for similar benefits. The question then becomes: what is the analog to the (almost) fully abstract "Object" of the type-system in the language-space?

While some might say the lambda calculus or some-such, I don't think they fit the bill, because although they can be regarded as foundational, they are actually quite specific. Because if they weren't specific enough, they'd have a hard time being foundational.

What I came upon are the basic concepts from software architecture: component and connector. These fit the bill in being abstract enough that they don't yet impose any kind of specific implementation except that you have something that somehow interacts with or is connected to something else. On the other hand, they actually do seem to be a good enough basis that pretty much all computational models have been categorized as specific kinds of components and connectors.

With that as the conceptual meta-model, you can start incorporating all sorts of useful abstractions as first class entities, including for example dataflow or array programming. As these are less general than generic call/return programming, they lend themselves to optimized implementations without additional out-of-band mechanisms such as annotations, though separate optimization schedules such as in Halide also seem like a good idea.
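To illustrate how little the component/connector meta-model commits to, here is a deliberately tiny sketch: the only assumption is that something produces, something consumes, and a connector joins them; a dataflow pipe is then just one concrete connector. All names are invented for illustration and have nothing to do with Objective-Smalltalk's actual design.

```python
# Component and connector as the meta-model: a connector only knows that
# something interacts with something else; a dataflow pipe is one specific kind.

class Component:
    def receive(self, value):              # how the environment pushes data at us
        raise NotImplementedError

class Pipe:                                # one concrete connector: one-way dataflow
    def __init__(self, source_iter, sink):
        self.source, self.sink = source_iter, sink
    def run(self):
        for value in self.source:
            self.sink.receive(value)

class Doubler(Component):
    def __init__(self, downstream):
        self.downstream = downstream
    def receive(self, value):
        self.downstream.receive(value * 2)

class Printer(Component):
    def receive(self, value):
        print(value)

Pipe(range(3), Doubler(Printer())).run()   # prints 0, 2, 4
```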

My current implementation of these ideas, Objective-Smalltalk (or Objective-S? Currently considering renaming) has focused more on the expressiveness side of the equation, but performance has always been in the back of my mind. Particularly the dataflow-ish parts have also already had positive performance effects.

Metamodel for a Language System

(Aside: link repair - Objective Smalltalk.)

The meta-model I'm exploring focuses on data rather than programs.

Quick summary: Modules compute values. Programs have homoiconic representation as structured data. Files have user-defined syntax, guided by file-extension. To process `foo.xyz` we seek module `language-xyz`, whose value must represent a program. Computed binaries can be extracted. Compilation is computation and extraction of an executable binary based on values that represent programs.
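A toy sketch of that dispatch, just to pin the idea down: the file extension selects a `language-<ext>` module whose value stands in for "a program" (here, simply a parse function). The registry and module values are stand-ins, not the real design.

```python
# Toy sketch: extension -> 'language-<ext>' module -> value representing a program.
LANGUAGE_MODULES = {
    "language-txt": lambda text: ("literal", text),
    "language-csv": lambda text: ("table", [row.split(",") for row in text.splitlines()]),
}

def process(filename, contents):
    ext = filename.rsplit(".", 1)[-1]
    language = LANGUAGE_MODULES.get(f"language-{ext}")
    if language is None:
        raise LookupError(f"no module 'language-{ext}' for {filename}")
    return language(contents)              # the module's value "represents a program"

print(process("notes.txt", "hello"))       # ('literal', 'hello')
print(process("data.csv", "a,b\n1,2"))     # ('table', [['a', 'b'], ['1', '2']])
```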

My vision for build systems is that they should include use-cases such as constructing ray-traced images and videos, music synthesis, packages that represent databases or catalogs (maps, fonts, etc.), physics simulation, machine-learning. Further, computed binaries including executables can also be imported for inspection, simulation, fuzz testing, integration testing, or further composition (e.g. into a zip-file or virtual-machine container). We should be able to manually target code for heterogeneous contexts, e.g. a mixed web-server and JS+DOM web-app.

I find it terribly awkward that today we cannot easily import and manipulate software artifacts the same way we import code, and vice versa. To cover many use-cases, the program model is general purpose, carefully designed for scalability, composition, comprehension, and caching. I intend to use acceleration to support high-performance computing without sacrificing other nice properties. However, the only essential client is my language modules. Users can define and compile alternative program models, and build syntax around their alternative models.

You state above that "you need to separate the conceptual stack from the implementation stack, because with the latter we obviously have to have the low-level stuff at the bottom". And of course I'll need some interpreter of the program model and a lightweight bootstrap syntax. But I feel it's worth noting that we aren't stuck with a 'stack' pattern/architecture. There is no specific target below the language module's program model. And user-defined languages won't need to embed or target the language module's program model, just treat that as another heterogeneous target (for static computation) together with x86 and JS+DOM.

My goal is to have a much nicer foundation for language systems - one that makes it easy to control dependencies, support caching and incremental compilation, support enormous distributed builds, integrate machine-learning, automate testing, support heterogeneous targets, extend and manipulate syntax, etc..

It's because capital, probably.

it appears there's been almost no progress in languages for 50 years.

Why is that? Are these the best possible languages?

From a historical materialist point of view, such a question requires inquiry into the underlying mode of production. Surely, dramatic improvements in programming languages are still possible from a human productivity standpoint, yet they are not aggressively pursued, so we should look at the current role of software in political economy.

Some speculation:

I would note that the big tech giants have two properties:

(a) They have world-historic levels of income per labor hour. In fact, they have many employees doing purely speculative work that never produces revenue, never mind income, so if pared down to just their productive core, the tech giants would have even larger "labor productivity" figures, in conventional accounting.

(b) They operate in tech markets that are relatively free from competition. They only really compete against the average rate of profit across all sectors. In that competition, they are hands-down champions.

Relative to the size of revenues per worker hour, the high salaries afforded any modestly competent programmer are peanuts. Their "labor productivity" is that high. (So much of this is about selling ads, so it has only a weak relation to the quality of these monopolists' code, user experiences, etc.) Moreover, persons who are amply qualified at a technical level are in such abundance, that hiring decisions tend to revolve around other factors.

In other words: the productivity of the "average programmer" at these dominant firms is not in any way a bottleneck to profit making. At the same time, these tech employers have absolutely no economic pressure to economize on labor for any other reason. Therefore they lack incentive to directly or indirectly invest, or call for investment, in next steps in programming languages.

Lastly, it's not like everything just ground to a halt in 1970. It's been a long decline. It is hard to imagine the trend reversing anytime soon, so long as the primary purpose of increasing programmer productivity would be to economize on labor in profit-making endeavors.

Any promising idea for a big leap innovation in programming languages is likely stranded for want of support and opportunity to develop the idea and test it out in practice. In our society, people have to give up their time to sell labor where it is being bought in order to live. So only a very tiny subset of the population with free time could possibly spend a lot of time on breakthrough programming language ideas.

sic et non

If I follow you, you're suggesting that PLs haven't improved much because our corporate overlords have no profit motive to do so. This probably has some degree of truth in it, but I offer this caveat: the suggested reasoning supposes that PLs only get improved if the corporate overlords say so. I'd be less skeptical about a claim that PLs don't get widely adopted unless the corporate overlords say so. But the idea that people in general (not limiting oneself to the pathological case of corporate overlords) are motivated solely by profit is capitalist propaganda. On the contrary, society cannot function on a pure profit motive, which is exactly why capitalism only works on a small-to-middling scale; small businesses leave sufficient room for decisions by people who aren't mentally ill. As for the role of corporate overlords' pathologically profit-oriented decision-making in stifling improvement of PLs, several questions come to mind.

Lack of corporate funding admittedly would mean fewer, and smaller, initiatives toward improving PLs. However, how much would this retard progress? What kinds of low-hanging fruit would be altogether untouched just for this reason alone? What kinds of low-hanging fruit would be attended to anyway? And just how much, or how little, would lack of corporate funding really retard deeper research? Some kinds of science require massive investments, like supercolliders hundreds of miles across, but it's not obvious that PL design would be of that sort.

Then there's the question of what happens when someone does come up with a significant improvement to PL design. I don't favor conspiracy theories about oil companies ruthlessly (or ruthfully, for that matter) suppressing somebody's discovery of a massive source of cheap power, or the like. If somebody comes up with a really obviously superior improvement to PL design, without a downside, wouldn't we know about it? Granted, my own experiences suggest that getting new ideas out into the world can be difficult and riddled with misunderstandings, but I don't sense a premeditated conspiracy behind it. And people do know about my fexpr idea (even if they've still got some misapprehensions about it).

the reproduction of programming languages

Eh... I won't dwell on this because I don't want to drag the exchange too far from the topic but, no, I don't mean anything about "corporate overlords" and such. So, no, you don't quite follow my point.

Let us suppose Jane Smith has a notion for a programming language (call it "XL") that, if fully played out, could really double the productivity of a programmer - at least in those applications for which the language was well suited. Let's even suppose she cranks out a working implementation, at least at a "working prototype" level.

My suggestion here is that, for the most part, XL has no quarter in this world. Who needs it? Big Tech doesn't need it because they make their revenue on a subset of their programmer payroll, and the total salary to that subset is on the scale of "change I forgot about but found in the cupholder of my car." For those potential XL users, the transition to XL is pure cost, and the gain really doesn't much shift the business' bottom line. Neither employer nor employee has much incentive to pick up XL at all.

What I'm suggesting (not proving) is that XL fails to reproduce itself over time. Without any role in the dominant economic uses, who in the world has time to build libraries for XL, to make new implementations of it, and so forth? It tends to wither.

No conspiracy, just economic forces

JavaScript was designed and implemented in two weeks. The people developing Netscape understood that being first to market, with familiar syntax to obtain momentum early, was much more important than low wat-age, consistency, flexible and reusable abstraction, or other properties that a PL designer whose main concern is not popular market success might care about.

It's really very difficult for improvements to propagate in such environments. Especially if they're foundational improvements, which would require pervasive changes to adopt into the existing structure.

Yet, there are enough new languages that do gain traction and eventually have huge libraries, often while being very similar to what already exists. I'm not too pessimistic about improved languages eventually gaining traction, but it might require some forces I don't understand well, e.g. a charismatic, extroverted promoter.

I am somewhat concerned that we might fail to improve our PLs before AI takes over most of the programming.

Similarities at other scales

The same phenomenon plays out on a local level as well. Suppose that Jane has taken a job with Big Tech. Most of the problems that she is asked to solve are variations on a theme. There is little novel creative work to be done, most of the work is mechanical in nature: take well known solution X, apply it to domain Y, make a small number of specializations Z. In this type of work the actual programming effort will account for maybe 5% of the total effort expended, while integrating the result into the existing body of work will account for 95%.

Jane has a limited budget (of time / attention / energy) and is probably being evaluated according to some set of metrics that are a proxy for speed. If she chooses to use her novel language XL to engage in this routine "paradigmatic" work then it will affect her speed in two ways:

* Velocity in solving the programming problem may increase.
* Velocity in integrating the solution into the existing code-base / tooling / deployment etc may decrease.

For Jane to win on this trade-off, she only needs the increase in velocity in programming to be 20x larger than the decrease in velocity in integration. This is why so few ideas achieve traction. For there to be a reasonable hope of coming out ahead there needs to be a significant bottleneck in the programming part that can be improved. Typically this only occurs with an entirely new domain where the old approach runs into problems.
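To put rough numbers on that tradeoff (using the 5%/95% split from above; the 2x speedup and 5% slowdown are made-up figures for illustration):

```python
# Rough numbers for the tradeoff above.
programming_share, integration_share = 0.05, 0.95

speedup  = 2.0     # XL doubles Jane's programming velocity
slowdown = 0.05    # integration with existing tooling gets only 5% slower

new_total = programming_share / speedup + integration_share * (1 + slowdown)
print(new_total)   # ~1.02 of the old total: the doubling is wiped out by a 5% integration hit
```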

It seems to me that rather than wandering off on a tangent, you are hitting the nail on the head w.r.t the original question. I would suggest (agree?) that the languages and approaches that are dominant occupy the evolutionary niche for the set of problems that we have been interested in for the past 50 years. The cost of change outweighs the benefits achievable within that niche, and so for large-scale change to occur there must first be a large-scale change in the target domain of problems that upsets the existing equilibrium. Until that occurs the expected-return on tooling to incrementally improve existing approaches will continue to exceed the expected-return of change. Elsewhere in the discussion John Shutt mentioned Kuhn's "Structure of Scientific Revolutions" - I find the metaphor to be a good fit for the "economics" of time invested in advancing programming languages.

The role of PLs

I see an interesting picture emerging here regarding what a programming language is.

Thomas Lord supposed "Jane Smith has a notion for a programming language [...] that [...], if fully played out, could really double the productivity of a programmer". My first thought on reading that was that merely doubling productivity (never mind quibbles about exactly what that means) doesn't sound like much; in fact, it's arguably trivial. I'd always supposed that fundamentally improving the abstractive power of a language —what I mostly used to focus on— would produce an exponential increase in productivity over time; I've tended to imagine it using big-O notation, so that a factor of two would not even show up. It didn't fully register on me, at the time, that this difference may be only superficially numerical; behind the numbers, there may be differing assumptions about what role a programming language plays in the software development process. I expect my assumptions are old, were naive when they were young, and haven't been reexamined for decades; it seems unlikely his assumptions could possibly do as badly, but there may yet be something worth questioning in them too.

Then, in elaborating Jane's local situation, you (Andrew Moss) bring out the role of the PL in more detail. You're supposing that an improvement in the PL will speed up the part of the task that is independent of integrating with the surrounding software environment, and slow down the integration with the surrounds. The linchpin of that supposition is that the particular programming language is all about the independent part of the task, and not about integrating with the surrounding environment.

Which brings out the archaic/naive core of my exponential-increase expectation: that everything will be written exclusively in just the one language — starting from a "base language", which we assume gets built into it whatever environmental integration is needed, everything thereafter will be written in the one abstracting language.

Obviously everything doesn't get written in one language (unless you count the entire multilingual mix as the chosen language). Is it equally clear that the role of a programming language does not necessarily exclude making it easier to integrate with the rest of the software environment? Seems to me these two visions of where a PL fits into the picture are both oversimplified (though I flatter myself that my vision, which I suppose I inherited from the Iron Age, out-oversimplifies the other by a goodly margin). It seems worth asking both what a programming language can do to facilitate integration, and what it can do to prevent its abstractive power from being undermined by destructive interference by the surrounding environment.

Well said, John, in many

Well said, John, in many ways.

Whether the benefits of good design (in a general sense) are additive, multiplicative, or exponential, the returns are significant. In the case of programming languages, this is certainly true, and does appear to be exponential (compounding).

As a study of the benefits, with personal reflection I can only compare my productivity over time, which controls for some difficult factors but allows others (which would be fixed in a real study) to vary dramatically. I think that one of the things that a good programming language does for its users is allow them to manage complexity better than the alternatives; most of our time and effort, as software developers and/or IT staff, is spent managing complexity. Additionally, projects that are more complex are the ones that have the highest incremental (likely exponential?) cost per complexity. So programming languages are, at their simplest, tools for managing complexity.

Unidentified, pervasive architectural mismatch (Call/Return)

The question of lack of significant progress is also one that's been troubling me for some time now. While the reason for such lack of progress could certainly be that significant progress is no longer in the cards, that we have basically reached the pinnacle of what is possible, the amount of pain we still have developing even seemingly simple software systems makes that seem both unlikely and too awful to seriously contemplate.

I'm currently watching Brad Cox's interview for the Computer History Museum's Oral History project, and among the many interesting tidbits was his observation that language doesn't much matter. His example was moving from C to Ada, but an Ada programmer wouldn't have much trouble orienting themselves in today's hotnesses such as Swift or Kotlin.

As far as I can tell, this lack of progress or lack of difference comes from the tacit consensus that all "programming" languages must conform to the call/return architectural style in some way, be it with procedures (imperative/structured), methods (OO) or functions of some defined purity (FP). (This has also been called an "expression/evaluation" model).

This model/style is rooted historically in the original use of electronic stored-program computers in roughly scientific endeavors, where the computer was used to, well, compute an answer given some inputs. Which fits the function/procedure model. And since that model was already present, it was also used to organize the computations and programs.

However, the majority of applications of computers no longer fit this model, so there is a mismatch between the systems we are trying to build and the tools we have for building them. And we can't abstract this mismatch away like we can others, because the problem is in our very abstraction mechanism.

So that's part of it.

The other part is that we certainly have had plenty of alternative paradigms/architectural styles/models, but they haven't really taken. Why is that? A part is certainly the path-dependency of having started with call/return. But I don't think that's the whole story. The other part is that those alternative paradigms are usually too restrictive. As an example, modeling dataflow in a procedural or OO language may be somewhat painful, but nothing compared to trying to model a recursive Fibonacci in dataflow.

So we hit brick walls and revert to some "general purpose" (read call/return) programming language and implement what we need as a library, mismatch and all.

And so our tacit consensus is reinforced, "programming", particularly when it is "general purpose" means organizing procedures/methods or functions to achieve our goals.

IMHO, the history of PLs in theory and practice of the last 50 years shows that any advance in programming languages cannot be something that replaces call/return. Instead, it must generalize both call/return and other paradigms in order to allow them to coexist and be adapted together.

See also: Why any Fundamental Improvement in Software has to be a Generalisation.

Also, my current attempt: Objective-Smalltalk

re call/return

Call/Return is an awkward fit for modeling interaction and concurrency. And 'effects' generalize as interaction between a computation and its concurrent environment, so this is a big concern.

Most concurrency models also introduce non-determinism, which is a troublesome property for many use-cases. Some introduce backtracking or global coordination, also problematic.

Fortunately, there are a few deterministic, local, monotonic, distributed, concurrent, interactive computation models. For example, Kahn Process Networks or Lafont Interaction Nets. I think this kind of model has the best ability to replace call-return.
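For a feel of the Kahn-style picture, here is a bare-bones sketch: processes communicate only over FIFO channels, reads block, and the output is the same regardless of scheduling. Plain Python threads and queues stand in for a real KPN runtime, and the end-of-stream marker is just a convention for the example.

```python
# Bare-bones sketch of a Kahn-process-network-style pipeline: FIFO channels,
# blocking reads, deterministic output independent of thread scheduling.
import threading, queue

def producer(out_ch, n):
    for i in range(n):
        out_ch.put(i)
    out_ch.put(None)                             # end-of-stream marker (a convention)

def doubler(in_ch, out_ch):
    while (item := in_ch.get()) is not None:     # blocking read is the only synchronization
        out_ch.put(item * 2)
    out_ch.put(None)

a, b = queue.Queue(), queue.Queue()
threading.Thread(target=producer, args=(a, 5)).start()
threading.Thread(target=doubler, args=(a, b)).start()

result = []
while (item := b.get()) is not None:
    result.append(item)
print(result)                                    # always [0, 2, 4, 6, 8]
```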

those alternative paradigms are usually too restrictive

Just as a point to contemplate: There are plenty of models that are vastly more permissive and expressive than call/return, such as temporal logic programming. Arguably, one reason a lot of them fail to gain traction is that they aren't restrictive enough. It can be difficult to reason about their performance or integration with hardware.

Though, I think path-dependency is still the largest factor.

a claim without substance

Call/Return is an awkward fit for modeling interaction and concurrency.

That seems like a claim entirely without substance.

Concurrency is an issue due to only a smattering of problems:

1. State that is shared mutable across concurrent domains of execution;

2. Mutable state that is visible (or side-effects of any such state change are visible) across concurrent domains of execution;

3. The ability to recognize the passage of time across concurrent domains of execution.

That's it.

We can get rid of #1 by not having globally shared mutable state (C, C++, Java, C#, Python, etc.)

We can mitigate #2 by tying the predictability of execution to the domain of the data that it may mutate (and here, the normal syntax of call/return should work fine, even if it hides an asynchronous junction point).

And generally we can ignore #3 because it's just not worth fighting the passage of time.

This seems to be an area where making it more complex than it needs to be is a bad idea.

haha

If you want to reject my claim, your argument should be "here, look how easy and not-awkward it is to model these three interacting components using call/return". What I read above seems closer to "here, look at how well I ignore most issues".

Mutation and shared objects are not intrinsic to concurrency or to call/return. They are used in order to shoehorn concurrency into a call/return paradigm, i.e. because you need 'effects' to interact with other concurrent components through a shared resource before you return. This is already awkward.

I'd prefer to use concurrency to model effects and call/return, not effects and call/return to model concurrency.

Not being difficult

I'm having a hard time understanding your points, despite the fact that we're both seemingly using English.

I do think that there is a reasonably simple way to solve this problem, but I am not suggesting that it is the only solution, nor even the best solution, etc. So perhaps if I describe it, we can see if we are even talking about the same problem (because I now begin to doubt that point).

First, assume that a program is modeled as any number of connected von Neumann machines, each with its own processor and memory, but working with a shared type system. (This is in stark contrast to the typical model of "shared mutable everything".) You can imagine this in terms of Erlang, if it helps.

Second, assume that calls into one of these von Neumann machines from another of these von Neumann machines uses the same standard call syntax (and same standard call compilation) as a function (etc.) call occurring within a single von Neumann machine.

Third, assume that returns from one of these von Neumann machines back to another of these von Neumann machines uses the same standard return syntax (and same standard return compilation) as any function would use in returning from a call occurring within a single von Neumann machine.

Fourth, assume that each von Neumann machine is potentially running concurrently with and independently of all other von Neumann machines with which it is connected, processing the incoming calls in a FIFO style manner (again, imagine Erlang).

Fifth, assume that a by-product of the call on the caller side of that equation is a state representation of the call in progress; you can call it a "promise" or a "future" if that helps. Furthermore, assume that the programming model allows such a "promise" or "future" to be explicitly obtained and managed, such that the corresponding call is non-blocking from the caller's point of reference, allowing the caller to proceed in its own execution.

I would posit that such an arrangement is ideal if it can be implemented in a sufficiently efficient manner, in that calls that involve a separate von Neumann machine (a separate domain of mutability) appear to be as simple as "local" calls within the same von Neumann machine, and would by default provide the same blocking semantics (i.e. implicit async/await) as local calls. Visibility of information (other than that communicated via function arguments) is limited to the mutable domain of the von Neumann machine, so read and write barriers become unnecessary, as do guarantees around "happens before" and "happens after" with respect to concurrent mutation visibility.

A few other points to consider, in the model that we constructed:

* Functions (and methods and interfaces in general) can be passed across boundaries; if they are tied to state in the originating von Neumann machine, then each call to that function is actually a call into its originating von Neumann machine. This easily allows for reactive style programming models.

* All functions (and methods and interfaces in general) can be easily invoked in a non-blocking manner; instead of "Int i = foo()", one simply has to decorate the return site with a future annotation: "@Future Int i = foo()"

* Re-entrancy is an issue; the issues around re-entrancy do not magically go away. (I'd love a solution for that one, but I can't imagine one thus far.) Our solution was to provide manageable re-entrancy, and critical sections that built on top of the re-entrancy control. Since mutable data is local to the von Neumann machine, the only execution you are protecting from is your own, but that doesn't make it any less dangerous.

* Each invocation into a von Neumann machine from outside of the von Neumann machine is represented by a separate fiber. Each such machine is capable of running at-most-one fiber at a time; concurrency is _among_ these machines, and not inside of any one machine.
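A crude Python analogue of this arrangement, to make the shape concrete: each "von Neumann machine" becomes a single-threaded executor draining a FIFO of incoming calls, cross-service calls hand back futures, and waiting on the result is the default while not waiting is opt-in. The `Service` class and its API are invented for illustration; this is not the Ecstasy/xtclang design.

```python
# Crude analogue: one single-threaded executor per "machine" (FIFO, at most one
# fiber running); calls between machines go through futures.
from concurrent.futures import ThreadPoolExecutor, Future

class Service:
    def __init__(self, state):
        self._state = state                               # mutable state stays inside
        self._fiber = ThreadPoolExecutor(max_workers=1)   # processes calls FIFO

    def call(self, fn, *args) -> Future:
        return self._fiber.submit(fn, self._state, *args)

counter = Service(state={"count": 0})

def increment(state, by):
    state["count"] += by
    return state["count"]

# Default, blocking style: reads like an ordinary call.
print(counter.call(increment, 1).result())     # 1

# Opt-in asynchrony: hold the future, keep going, collect later.
pending = counter.call(increment, 41)
print("caller keeps running...")
print(pending.result())                        # 42
```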

concurrency

Concurrency in computing is the expression of computations in terms of interacting components. This includes stateless, synchronous, and deterministic models of concurrency, for example.

The problems you're trying to solve aren't truly 'concurrency' problems. They're 'call/return' problems. Particularly, the problem of using call/return (which, fundamentally, expresses a computation as doing one thing at a time then returning) to express concurrent computation. Reentrancy is a problem nearly unique to call/return. That you're even faced with these problems is awkward.

With call/return, you can barely avoid conflating other 'features' with your concurrency. For example, I'm not very fond of non-deterministic computations. If I ask for JUST concurrency, not that other stuff, call/return does not have any good answers.

You describe a design that I assume you've used with some success. But there are still many issues.

  • Reentrancy is a big one, which you've noticed. Your developers will be designing around that limitation with queues for deferred calls, or attempting to mitigate.
  • Responses always return to the caller, per the nature of call/return. But this can lead to a lot of undesirable traffic, when what we want to express is sending the result as an input to another call, e.g. `foo(bar(baz()))` where foo, bar, and baz are on up to three separate machines.
  • Concurrent calls from other machines can add a lot of unpredictable latency to any remote interaction.
  • Multiple small calls with some intermediate computations is the normal mode for call/return programming. But with 'remote' calls, this pattern accumulates latency, has extra risk of time/state interference from other machines, and poor utilization due to waits. Futures can't help due to the intermediate computations with the results of one call as arguments to the next. Thus, your developers are under pressure to use and support ever larger calls that do ever more work, with ever more flags and parameters. Or to avoid using your concurrency as much as possible.
  • If you want to represent a consistent action across multiple machines, you're pretty much out of luck, unless you treat your machines as a compilation target from a nicer concurrency model.
  • Your concurrent calls will likely be 'second-class' in what types they support for call and return, e.g. with respect to pointers or futures. Yet, developers are likely to model references, e.g. session IDs for ongoing remote work. This requires some extra manual translation, e.g. via hashtables.

I don't want to be entirely negative on the design. It's at least more tractable than locks. I'm sure it can be used successfully, despite its awkward elements.

A short reply

As you know, explaining a large design in a small post on the Internet is the recipe for confusion, so I tried to keep the concepts clear and concise.

Despite all of my previous work in massive-scale distributed computing, all of those von Neumann machines that I described (likely millions of them) are most likely running inside of the same process space, on the same physical machine. So their invocation is almost certain to involve at least one layer of indirection beyond a typical CALL site, more akin to a dynamic stub/proxy model. It's hard to select words that will convey the design without opening more crates of Pandoras, but let me try: We will use dynamic code generation and statistics-driven code re-generation to shape the junction points between these multiple von Neumann machines, based on things like contention and backlog. The shaping and reshaping of those junction points will be invisible to both the invoker and the invokee; the guarantees are in the language model, and written in a way that allows that code generation a wide latitude in its approaches to optimization.

"Concurrency in computing is expression of computations in terms of interacting components."

Semantics are important. I'm not sure that you and I are using this word in the same way. (Insert Inigo Montoya meme here.)

Concurrency, from a typical developer's point of view, is the ability to de-serialize their program's operations in a manner such that modern hardware and operating systems (etc.) can execute multiple de-serialized portions of their program at the same time (i.e. concurrently) without sacrificing the correctness of the program. Over the past several decades, programmers have accomplished this using processes, threads, fibers, semaphores, mutexes, hardware CAS support and other types of atomic memory operations, critical sections, queues, futures/promises, transactional memory, async/await, and so on.

I personally feel that we (as an industry) have not yet produced a compelling, simplified programming model that covers the needs of concurrency, while providing the simplicity and safety of serialized code, while providing the efficiency of the current hand-coded witches' brew of concurrency building blocks that I enumerated above. What I was describing was our (xtclang.org) take on this problem, beginning with some principles (e.g. no shared mutable state), and working from there.

Reentrancy is a big one, which you've noticed. Your developers will be designing around that limitation with queues for deferred calls, or attempting to mitigate.

Dealing with reentrancy (and having to deal with reentrancy) within a mutable context is conceptually unpleasant; however, you seem to be assuming a level of primitiveness that I do not recognize. Developers in Ecstasy do not deal with queues of deferred calls at all, for example; they simply call what they need to call, and everything else takes care of itself.

I do not want to derail the conversation away from the important points, but I will risk doing so by adding: Each von Neumann machine controls its own reentrancy policy with respect to other von Neumann machines; in doing so, a developer can constrain concurrency, temporarily if desired. But most developers will never have to deal with even this level of complexity.

Responses always return to the caller, per the nature of call/return. But can lead to a lot of undesirable traffic, when what we want to express is sending the result as an input to another call, e.g. `foo(bar(baz()))` where foo, bar, and baz are on up to three separate machines.

foo, bar, and baz would be on three separate von Neumann machines only if they were both (i) stateful and (ii) that state were mutable, thus preventing those operations from being executed within any arbitrary context. Only mutable state (!!!) cannot penetrate / migrate through the membranes of these von Neumann machines.

Concurrent calls from other machines can add a lot of unpredictable latency to any remote interaction.

Yes, of course. That is natural, and even a desirable property. We anticipate having millions of running fibers at any given time, on machines with only a few hundred hardware threads of execution. In such a universe, latency is not intended to be predictable. Manageable? Yes. Dynamically optimizable? Yes. But predictable, in the truest sense of the term? No.

Multiple small calls with some intermediate computations are the normal mode for call/return programming. But with 'remote' calls, this pattern accumulates latency, carries extra risk of time/state interference from other machines, and suffers poor utilization due to waits. Futures can't help, because the intermediate computations feed the results of one call into the arguments of the next. Thus, your developers are under pressure to use and support ever larger calls that do ever more work, with ever more flags and parameters. Or to avoid using your concurrency as much as possible.

Again, you seem to be assuming a level of primitiveness that I do not recognize.

First, I would encourage you to remember that these calls are conceptually remote, but not to think of that remoteness as their defining feature, because that is preventing you from seeing the purpose of the design. One creates one of these little von Neumann machines (what we simply call a service) in order to create the potential for asynchrony. The explicit use of a future is simply a developer’s explicit recognition of that potential for asynchrony.

Second, there are no explicit waits. Nor explicit locks. (etc.) The waits are implicit, just as with any function call: the code does not progress to the next line until the function has returned. On the other hand, using a future, the code does (conceptually) progress immediately to the next line, because the developer has recognized the potentially asynchronous nature of the call.

(The @Future annotation provides a simple example in its documentation.)
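As a rough analogy only (plain Java's CompletableFuture, not Ecstasy's @Future mechanics), here is a minimal sketch of the distinction between an implicit wait and an explicitly acknowledged future:

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration (Java stand-in, not Ecstasy): an implicit wait
// versus an explicitly acknowledged future.
class FutureSketch {
    static int slowCall() { return 42; }   // stand-in for a call into another service

    public static void main(String[] args) {
        // Implicit wait: the call site looks synchronous, and control does not
        // reach the next line until the result is available.
        int x = slowCall();
        System.out.println("synchronous result: " + x);

        // Explicit future: the caller acknowledges the potential asynchrony and
        // proceeds immediately; the result is consumed whenever it arrives.
        CompletableFuture<Integer> fx = CompletableFuture.supplyAsync(FutureSketch::slowCall);
        fx.thenAccept(v -> System.out.println("asynchronous result: " + v))
          .join();   // block here only so the demo prints before exiting
    }
}
```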

Your concurrent calls will likely be 'second-class' in what types they support for call and return, e.g. with respect to pointers or futures. Yet, developers are likely to model references, e.g. session IDs for ongoing remote work. This requires some extra manual translation, e.g. via hashtables.

Again, you seem to be assuming a level of primitiveness that I do not recognize.

There is a “limitation” on what types can be passed into and out of these von Neumann machines, but if you just close your eyes and think through this, you already know what the limitation is: mutable state. For example, my code can attempt to pass mutable state to another von Neumann machine (another “service”), but in doing so, what is actually passed is a reference back to my von Neumann machine (which is how re-entrancy can even occur in the first place). The only things that can be passed among these von Neumann machines are (i) immutable state and (ii) references to von Neumann machines (i.e. proxies).

In other words, the code will still work, unchanged, but beneath the covers the reference to the mutable state is being proxied, so that the other von Neumann machine cannot directly mutate or even gain direct visibility to that state. All of this is done automatically. (No “session IDs”, no “manual translation”.)
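To make the idea concrete, here is a sketch of the general technique in Java (my own illustration, with hypothetical names; it is not the Ecstasy runtime): the mutable state stays confined to its owning service, and what crosses the boundary is a proxy that forwards every call back to the owner.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of the general technique (not the Ecstasy runtime): mutable state
// never crosses the service boundary; a proxy that forwards back to the
// owning service is handed out instead.
interface Counter {
    int increment();
}

class CounterService {
    private int count = 0;                                        // mutable, confined here
    private final ExecutorService owner = Executors.newSingleThreadExecutor();

    // Another "machine" asking for the counter receives this proxy,
    // not a reference to the state itself.
    Counter proxy() {
        return () -> {
            try {
                // Every mutation is marshalled back onto the owner's executor.
                return owner.submit(() -> ++count).get();
            } catch (InterruptedException | ExecutionException e) {
                throw new IllegalStateException(e);
            }
        };
    }
}
```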

re lightweight von Neumann threads

all of those von Neumann machines that I described (likely millions of them) are most likely running inside of the same process space, on the same physical machine

I suspected that might be the case. Nonetheless, the semantics you've chosen are those of a distributed machine. There is nothing wrong with this, but it still has essentially the same implications for latency, shared FIFOs, etc.

For contrast, Kahn Process Networks (KPNs) are also a semantic model of distributed machines that is very frequently executed locally on one machine. However, KPNs mitigate latency by using buffered channels instead of call/return for concurrent composition, and they avoid pervasive non-determinism by not sharing or non-deterministically merging channels (at least without an explicit effect or extension).
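For readers unfamiliar with the style, here is a minimal KPN-flavoured sketch (my own illustration in plain Java, not a dedicated KPN framework): processes are composed with bounded channels instead of call/return, and each channel has exactly one writer and one reader, so the output stays deterministic under any scheduling.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal KPN-flavoured sketch (illustration only): processes communicate over
// bounded channels, each with exactly one writer and one reader, so the
// network's output is deterministic regardless of scheduling.
public class KpnSketch {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> aToB = new ArrayBlockingQueue<>(16);
        BlockingQueue<Integer> bToC = new ArrayBlockingQueue<>(16);

        Thread produce = new Thread(() -> {
            try { for (int i = 1; i <= 10; i++) aToB.put(i); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread square = new Thread(() -> {
            try { for (int i = 0; i < 10; i++) { int x = aToB.take(); bToC.put(x * x); } }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        Thread consume = new Thread(() -> {
            try { for (int i = 0; i < 10; i++) System.out.println(bToC.take()); }
            catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });

        produce.start(); square.start(); consume.start();
        produce.join(); square.join(); consume.join();
    }
}
```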

dynamic code generation and statistics-driven code re-generation

Use of runtime profiling for JIT is an interesting and deep subject, but it's also getting pretty far off topic. :D

Semantics are important. I'm not sure that you and I are using this word (concurrency) in the same way. (Insert Inigo Montoya meme here.)

Indeed. When you kept assuming the 'problem' of concurrency is shared mutable state, I already knew you were mired in the paradigm of adding concurrency to imperative code.

Concurrency, from a typical developer's point of view

You'll find that the FP community has been more discerning here, distinguishing 'parallelism' for performance (ability to execute over multiple cores or devices) from 'concurrency' for semantics (ability for different subprograms to interact).

Wikipedia defines concurrent computing as 'overlapped' execution, emphasizing the modular nature - that the overlap is essential to progress of the computation. This also includes non-parallel versions, such as green threads.

Ideally, we want both. Concurrency without parallelism (e.g. green threads, coroutines, time-sharing) does work, but it's a wasted opportunity.

The typical imperative developer conflates the two concepts, and other concepts too (such as non-determinism). Yet, most also have no trouble accepting green threads as concurrency, despite the obvious lack of parallelism.

Developers in Ecstasy (...) simply call what they need to call, and everything else takes care of itself.

It's good that you can deal with some of the awkwardness via your compiler, rather than forcing every user to deal with it in the common case.

foo, bar, and baz would be on three separate von Neumann machines only if they were both (i) stateful and (ii) that state were mutable

This is not an unusual scenario, e.g. for pipelined composition of tasks.

Yes, of course. That (unpredictable latency) is natural, and even a desirable property.

I desire predictable performance. Although it would be reasonable to not prioritize it within your own goals, saying you desire the opposite sounds rather like you're sipping some Kool-Aid with your pinky finger raised. D:

The waits are implicit, just as with any function call (unless) using a future

I did understand this. Not sure what gave you the impression I thought waits were explicit.

code can attempt to pass mutable state to another von Neumann machine (another “service”), but in doing so, what is actually passed is a reference back to my von Neumann machine

So you're implicitly automating the creation of references? That's an option, I suppose, but it comes with its own set of challenges, such as life cycle management or distributed GC.

koolaid, oh yeah!

I think we're on the same page at this point.

You'll find that the FP community has been more discerning here

I would be disappointed to hear otherwise. Purity is a virtue; it's simply not one of my virtues.

Concurrency without parallelism (e.g. green threads, coroutines, time-sharing) does work, but it's a wasted opportunity.

Green threads and co-routines are excellent building blocks for concurrency; it's just that early implementations were designed for a single hardware thread. We use fibers, which are pretty much just a different name for co-routines. (Some people define fibers as a derivative of co-routines, while others define co-routines as a derivative of fibers; one can posit that if two things are each individually a derivative of the other, they are likely the same thing. In this case, they are not the same thing, but they do occupy the same general area.)

I desire predictable performance. Although it would be reasonable to not prioritize it within your own goals, saying you desire the opposite sounds rather like you're sipping some Kool-Aid with your pinky finger raised. D:

It is always difficult to convey thoughts adequately online. What I intend to convey is that the predictability that we are after is not the predictability measured in latency, but predictability of the computational model itself, and the ease with which a developer can mentally consume that predictability. I have spent countless hours debugging concurrency issues in code written by engineers who are far more brilliant than I, which is all the proof that I require to state unequivocally that our current programming models are broken.

So you're implicitly automating the creation of references? That's an option, I suppose

It is a reference-based language and runtime model. Much magic is hidden therewithin, and much magic yet to be created will also tuck itself into the same.

hmm

All this talk of concurrency and call/return is reminding me that I still have, waiting on my to-do list, a proper study of tonyg's dissertation, Conversational Concurrency.

It's the whole "mother of invention" thing...

I don't have nearly enough expertise for my take to compete with some of the other answers here, but I think basic human nature plays a huge factor in how (and whether) technology evolves.

The classic adage "Necessity is the mother of invention" is generally agreed upon, and it certainly does drive progress in PL evolution. But only slightly less well known is: "The production of too many useful things results in too many useless people." (Honorable mention: "Laziness is the first step towards efficiency.") The need to accomplish goals more quickly, with fewer errors and easier maintenance, was deemed a necessity that drove PL progress forward for decades.

But as those advancements took shape, the necessity dwindled, and the motivation to improve followed suit. That's not to say that the ingenuity went away; it just sought out the next easiest-to-eliminate inconvenience. As an example, jQuery isn't its own language, but, for better or worse, it dominates JavaScript examples nowadays. Someone could have worked on a new, better PL to replace JavaScript, but the easier route was to create a library instead.

I work as a programmer, and oftentimes I get an assignment that allows me to devise some clever solutions. But there's always a limit to how "clever" I can be. As an example, I built a C# console app that integrated two APIs and used a custom query syntax tailored to Command Prompt and PowerShell's reserved characters. Fun, right? But you know what would have been even more fun? Making a brand new PL built to control DB APIs from within command-line environments! But that would have been months (maybe even years) of work, instead of the few weeks I had to Frankenstein something together.

It's like that everywhere. The ones who get paid to develop have "MVP" and deadlines. Conversely, the ones who do it as a hobby often create cool things with no obvious real-world application or "boots on the ground" insight. Now and then you get the next "GoLang" to come out, but it needs time to mature and saturate into the industry before it gets adopted and becomes the next "thing to get certified on." So I think the cog that this all cranks around is human laziness. We can only justify what is needed. If there's no incontrovertible need, then it's an uphill battle to get progress, harder still to get acceptance, and harder yet to get conversion.

Haskell, Agda, Rust, Coq, HOL?

Haskell, Agda, Rust, Coq, HOL?

I reject the core proposition

Having programmed for over 50 years, I find the mental processes of programming have changed little, but the languages used to do it (and the tools) have progressed almost beyond recognition.

I started on Fortran, Cobol, Basic, along with machine code and assembler. Any HLL is much better than none, but these HLLs have only a few primitive types, plus arrays. And they are unsafe: programs that compile can crash and burn.

C, Pascal, C++ are a progression towards class-based type systems and encapsulation. This is a big step, but they are still unsafe.

Java and C# give away expressiveness for better safety (no more crash and burn). The dynamic languages (Python etc) give away more and are even safer.

We now have C++, C# and Java with highly expressive features including generics, iterators, closures, libraries like STL/Linq, and a myriad of competitors. To say these languages have not progressed in 50 years is a nonsense.

But we are overdue the next step, which in my view should start with safer. As long as the programmer has to think about alligators like null pointers, exceptions, cast failures, etc., they can't think about the important stuff like draining the swamp.

My bet is on safer, higher, shorter. But time will tell.

Safety isn't new

The story lists "APL, LISP, Algol, Prolog, SQL": APL, Prolog and SQL are pretty safe, and Lisp and Algol support alligator-sparse coding styles.

There are safety advances in your examples of "C++, C# and Java", but the safety story with each of these languages isn't completely straightforward. And Python seems to me to be a step backwards from 40-yo Common Lisp.

The biggest change is people's willingness to sacrifice performance for safety. And Rust contains definite advances that I think were not available 25 years ago.

Safety is simple but it isn't easy

A safe language is one that cannot crash and burn. That is, you write the code, you compile it and if it compiles it runs and it doesn't stop until you want it to.

In 1985 I started a major project in C in real mode MSDOS. The process was edit, compile, execute, reboot. Defensive programming takes on a whole new meaning. It's exhausting having to be that careful.

That project is now 400K LOC C/C++, and it's virtually bulletproof. But change anything and there is a good chance you'll hit a pointer fault. C/C++ is not safe, you have to be really careful.

Java/C# are basically safer C++ with some good bits left out. They don't reboot or pointer fault, but they do crash and burn with null reference exceptions, invalid casts, and lots of other things. You don't have to be as careful, but it's still not really safe.

I want a language at least as expressive as Java/C#, but one you cannot crash because the compiler won't let you. I don't want to have to be careful; I just want to write code and focus on the logic, leaving it to the compiler to pick up code-level mistakes.

There are parts of that in Ada, Eiffel, maybe Rust, but not enough.
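To make the gap concrete, here is a small Java illustration (my own example, not the poster's): the first lookup compiles cleanly but can crash at runtime, while the second makes the "might be missing" case explicit in the type.

```java
import java.util.Map;
import java.util.Optional;

// Illustration of the safety gap described above (my example):
// the first lookup compiles fine but can crash at runtime;
// the second forces the caller to confront absence.
class SafetySketch {
    static final Map<String, String> CONFIG = Map.of("host", "localhost");

    static int unsafePortLength() {
        String port = CONFIG.get("port");      // null: "port" is not in the map
        return port.length();                  // compiles, throws NullPointerException at runtime
    }

    static int saferPortLength() {
        Optional<String> port = Optional.ofNullable(CONFIG.get("port"));
        return port.map(String::length).orElse(0);   // absence handled explicitly
    }
}
```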

When they finally get round to fixing all the horrible bits in Java, maybe they should focus on safety. It pays off.

Maybe we're looking in the wrong place?

New, safer languages come out occasionally; after several iterations, this often leads to a language that achieves that degree of safety and is also usable. And it is this second milestone, not the first, that I regard as showing that this kind of safety is an improvement.

But the fact of the matter is that when we're tying these new languages together and putting them on actual platforms, we wind up tying the knots on the package using unsafe languages, because our type safety gives us no way to do inherently unsafe BUT NECESSARY things.

Let's say I have a device that I can control by issuing a hardware interrupt on a particular port while the number I want the device to display, expressed in BCD or some other absurd bit format, is at a particular location in memory-mapped I/O, and which when the interrupt returns will have written one of a dozen different arbitrary bit patterns there to tell how the display operation went or why it failed.

Do you suggest that there is any "type safe" language in which I can write the lowest level driver for that device?

There isn't. Because, like the interface requirements of most hardware, it's an inherently unsafe operation. And it is an absolutely necessary operation if we're going to use that hardware.

So I would say that we can't address all the problems of programming languages by pushing for more type safety. There has to be, somewhere, some way to write lowest-level drivers that gives a clear and concise binary view of the underlying hardware.

The unsafe stuff, which will be different absolutely every time a new machine is made, has to be written. It has to be maintained, bugfixed, updated for new hardware, etc. And if people never push anything except type safety, they're never going to make that any easier or more reliable to do.

dup

dup

ATS

Do you suggest that there is any "type safe" language in which I can write the lowest level driver for that device?

I think there are a few, such as ATS or F*. Dependent types, substructural types, and refinement types are useful for expressing the communication protocols for low-level devices.

like the interface requirements of most hardware, it's an inherently unsafe operation

Type safety can guarantee your program is consistent with an interface description. Ensuring your interface description is consistent with the hardware does involve some 'unsafe' trial and error today. But I think this isn't an essential difficulty.
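As a rough sketch of what "consistent with an interface description" can mean (my own Java illustration with hypothetical names; a real driver would of course live below the JVM): the untyped poke is confined to a single routine, while the rest of the driver speaks only in types that mirror the device's documented protocol.

```java
// Hypothetical sketch (my illustration, not from the thread): the untyped,
// inherently unsafe poke is confined to one method, while the rest of the
// driver is checked against types that mirror the documented protocol.
class DisplayDriver {
    // Status patterns the device is documented to write back after the interrupt.
    enum Status { OK, OVERFLOW, POWER_FAIL, UNKNOWN }

    // Encode 0..99 as packed BCD, the only representation the device accepts.
    static byte toBcd(int value) {
        if (value < 0 || value > 99) throw new IllegalArgumentException("out of range");
        return (byte) (((value / 10) << 4) | (value % 10));
    }

    // The one unsafe spot: in a real driver this would write to memory-mapped
    // I/O and raise the interrupt; here it is only simulated.
    private static int rawDisplay(byte bcd) {
        return 0x00;   // pretend the device reported "OK"
    }

    // Everything outside rawDisplay is checked by the compiler against the types.
    static Status display(int value) {
        int raw = rawDisplay(toBcd(value));
        switch (raw) {
            case 0x00: return Status.OK;
            case 0x01: return Status.OVERFLOW;
            case 0x02: return Status.POWER_FAIL;
            default:   return Status.UNKNOWN;
        }
    }
}
```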

A lot of hardware today is essentially 'compiled' from hardware description languages such as VHDL. At least in theory, it is feasible to support expressive types in hardware description languages, and to extract the hardware interfaces for use with drivers. Type systems in hardware description languages are a decent answer for the troublesome aspect of type-safe drivers, and would also be useful for modular hardware descriptions.

Of course, VHDL's type system is limiting in many ways. For example, if you have two pins that expect differential input according to a particular protocol, it is difficult in VHDL to express the coupling of pins or details of the protocol. A driver (or modular hardware) might attempt to use the pin with the wrong protocol. Thus, we'll still need sufficiently expressive type systems to fully solve this problem.

Bluespec

Have you looked at the Bluespec language, a functional hardware description language? A number of RISC-V processors have been designed using it. You can find the compiler and related tools on Github. Here Prof. Arvind (MIT) explains its parallelism and semantics in considerable detail.