LtU Forum

Design & Scripting Languages for Engineering & Scripting Reality?

It occurs to me that, as our world becomes more programmable, we need better languages for programming the world around us. Perhaps we can start discussing what those languages, and the run-time environments behind them, might look like.

We're in the Anthropocene - human activity has become the dominant influence on climate and the environment. We shape the face of the planet, and with almost 7 billion of our 8 billion carrying smartphones, we have reached the "singularity" - we have become globally connected rendering engines of an increasingly plastic reality. In a world of complex, self-organizing & adaptive systems, our dreams emerge into our collective subconsciousness (Facebook, Twitter, the Evening News), and we proceed to make those dreams real through our speech & actions. And so far, we've been letting our Ids rule the show - perhaps because we simply don't have the language, or the processes, to be more deliberate about negotiating the world we want to live in.

The notion of a Holodeck has been expanded greatly in recent years - "Westworld," "Ready Player One," "Free Guy" - and we get closer and closer to the notion of designing & scripting the world around us. Theme parks, "reality TV," LARPs, cons built around fictional universes, large-scale LVC (Live, Virtual, Constructive) military exercises ... bring us closer and closer to the point where we can deliberately design & script the world around us, at the stroke of a key.

But we're still living in a world where the pen rules. Westworld shows us engineers reprogramming the world from their tablets. Parzival & Blue Shirt Guy pull up virtual consoles, with super-user privileges. But, so far, the designs are conceptual - the GUIs are fictional, as is the code behind them. It's time that we start developing those interfaces & the run-time environments behind them. The Internet is increasingly the SCADA system for our planetary systems - it's time to start developing a new generation of protocols & languages.

Shakespeare wrote, "All the world's a stage, and all the men and women merely players." John wrote, "In the beginning was the Word, and the Word was with God, and the Word was God." In the introduction to the "Whole Earth Catalog," Stewart Brand wrote, "We are as gods and might as well get good at it." Learning to be better gods starts with improving our vocabulary, grammar, and diction. We need better design & scripting languages for shaping the world around us. And then we can talk about Design & Engineering Processes.

We have MATLAB for systems modeling. We have business plans, program plans, contracts, budgets & schedules for building systems. We have "mission orders" for military campaigns. But the closer we get to the daily experience of life, the more nebulous our language becomes - we're back to natural languages that are not particularly useful for defining scenes, scenarios, characters, or behaviors - for describing or scripting the stuff of reality. We use natural languages to write course catalogs & syllabi for universities, and to write conference programs for events. We write shooting scripts for movies. Unity is simply not up to real-world tasks, like setting up an improv scene to be played out by avatars.

Playing Game Master is still an art, practiced by individuals. If we are to truly master our reality, to write our own scripts, and to live together as cast & crew in each other's games, we need better languages for discussing & negotiating our roles & lines - rules of engagement & order for "reality jamming," if you will - ideally ones that let us all "Live Long & Prosper" in a sustainable world of "Infinite Diversity in Infinite Combinations." We seem to have gotten pretty good at setting up for a "Forever War" (or perhaps one ending in Mutually Assured Destruction). Now we need to get good at setting up for a "Million Year Picnic."

The question is... what do those languages look like? What's the computational model behind them? What's the run-time bridge between thought, rendering the scenery, and setting up the action?

This is a community of language designers - perhaps we can start the discussion here. Your thoughts, please!

A Manufacturer's Perspective on PL Progress

This may be the first post about manufacturing ever to appear on LtU. I offer it because, sometimes, we find lessons about our own fields in the fields of our neighbors. One or two of you will have noticed (perhaps even appreciated) my eight year hiatus from LtU. I've spent the last seven years working on a viable business model for American manufacturing through on-demand digital production. Testing ideas commercially has a way of making you look at them in somewhat different terms, and it seems to me that there are parallels between what has happened in manufacturing and what has happened in programming language adoption.

I recently rediscovered Milton Silva's post from April 2020, "Why is there no widely accepted progress for 50 years?", which prompted me to write this. The notion of "progress" is replete with observer bias: we readily notice advances in things that serve our objectives, and readily ignore advances in things that are not relevant to us on the day we happen to learn about them. So perhaps a better question might be: "Why has there been no uptake of PL innovations over the last 50 years?" Thomas Lord nailed the biggest issue in his response to Silva: "it's because capital, probably."

Because the parallels are direct, I'd like to give my view of why there has been no substantive progress in manufacturing in the last 50 years. With apologies to international readers, I'll focus on how production evolved in the United States because I know that history better; I don't mean to suggest for a minute that progress only occurred here.

The Path to Mass Production

In the US, the so-called "American System" of repeatable fabrication using standardized parts is commonly misattributed to Eli Whitney. Whitney mainly deserves credit for collaborating with Thomas Jefferson to copy the work of Honoré Blanc. Blanc, a French gunsmith, demonstrated rifles made from standardized parts in 1777. Blanc's methods were rejected by European craftsmen, who correctly viewed them as a threat to their livelihoods and the established European social structure. Unable to convince Blanc to emigrate to America, Jefferson (then Ambassador to France) wrote letters to the American Secretary of War and to his friend Eli Whitney describing Blanc's methods. Though unsuccessful in his attempts to copy them, Whitney did an excellent job of marketing Blanc's ideas in America. Among others, he found a strong supporter in George Washington.

Standardized parts took a very long time to succeed, driven initially through the efforts of the US Ordnance Department. Lore about Singer, McCormick, Pope, and Western Wheel Works notwithstanding, standardized parts did not yield cost-effective production until Ford's assembly lines generated standardized part consumption at scale. In the mid-1920s, Alfred Sloan created the flexible assembly line, which enabled the concept of a "model year" for vehicles. With this invention, Sloan resolved the Achilles heel of mass production by removing the challenge of satiation. In both production and consumption, "more is better" became a driving thought pattern of the American 20th century. Through two world wars and several major regional conflicts, American mass production powered the economic growth of the country and a significant part of the world.

Death Rattles

Viewed in retrospect, the re-entry of Toyota into the US market in 1964 with the Toyota Corona offered warning signs that mass production was in trouble. Unable to match the capital resources of American manufacturers, and unable to tolerate the costs of errors incurred in Deming's statistical quality control methods, Toyota had come up with the Toyota Production System, now known as lean manufacturing. US manufacturers had become complacent in their dominance, and were slow to consider or even recognize the threat this represented. By 1974, the Toyota Corolla had become the best-selling car in the world, and Detroit finally started to wake up.

By then it was too late, because 1974 also marked the global introduction of the standardized cargo container. With this invention, the connection between sea and rail shipping became relatively seamless across the world and the per-unit cost of international shipping became negligible. Labor and environmental costs became the principal variable costs in production, and both were cheaper overseas than they were in the United States. Standardized cargo shipping abruptly erased the geographical partitions that had protected US markets from foreign competition. By buffering transportation with inventory, the delays of foreign production could be hidden from consumers. Mass production began to move inexorably to countries that had lower labor and environmental costs for producers.

Basic arithmetic should have told us at this point that US mass production was in mortal peril: American salaries could not be reduced, so something else needed to change if manufacturing was to survive in America. Which prompts the question: Why has there been no substantive innovation in US manufacturing since 1974?

So What Happened?

There were several factors in the failure of U.S. mass production:

  1. Complacency. Toyota started building trucks in 1935, but they did not release vehicles into the US market until 1957 (the Toyopet Crown). In America, they were principally known as a sewing machine manufacturer (they built the best embroidery machines in the world until the business was taken over by Tajima in 2005). Their initial foray into the American market failed because the two markets had very different requirements, and they withdrew in 1961. The view in Detroit was that there was no reason to take them seriously. After all, they made sewing machines.

    The attitude in Detroit wasn't an exception. All of us are inclined to assume that the thing that worked yesterday will continue to work today, and consequently that market dominance is a sort of self-sustaining natural law. Which, for many reasons, is true. Until there is a regime shift in the market or the production technology. And then, sometimes very abruptly, it isn't true.

    Programming languages aren't competitive in quite the same way, but sometimes we indulge similar attitudes. In 1989, I was part of a group at Bell Labs that built the first-ever commercial product in C++. Cfront was fragile, and the prevailing view at the time was that C++ was a research toy that would never go anywhere. The Ada team down the hall, certainly, didn't seem all that worried. The waterfall model of software engineering was still king, and Ada was the horseman of its apocalypse. Both eventually collapsed under their own weight, but that happened later. Back then, C++ was young and uncertain.

    My 1991 book A C++ Toolkit described ways to build reusable code in pre-template C++. Picking it up now, I can only shake my head at how many things have changed in C++, and how those changes drove it to such a large success and to its eventual (IMO justified) decline. It succeeded by riding the wave of "abstraction is the key to scale". But in the early days it was largely disregarded.

  2. Capital. If you have an established way of doing things whose dominance relies on cost of entry and whose margins are thin, two things are true:

    1. New entrants will have to come up with a lot of money to compete with you using your methods.
    2. If the margins are low (which is to say: the time required for return is high), nobody with any brains will lend them that capital.

    Which is why there is essentially no investment in manufacturing happening today. Net margins in manufacturing typically run about 7%. If you can't get outside funding one way or another, you don't have much free cash to pay for innovation.

    Programming languages are similar: a new language in and of itself isn't very helpful. A significant set of libraries need to be written before the language is actually useful, which is expensive. And then existing programs need to be migrated, which is more expensive. And the benefit is... usually pretty uncertain. As in manufacturing, new methods and tools find success mainly by proving themselves out in a previously unserved niche. Clay Christensen refers to this as The Innovator's Dilemma.

    Of course, the other thing that happened by 1974 is that the microelectronics race found a whole new gear. Carver Mead's textbook Introduction to VLSI Systems was released in 1980, and by that time VLSI was already mature. The problem for manufacturing was that the returns on investment in microelectronics were a lot higher than the returns on manufacturing investment. Not surprisingly, investors migrate to the most profitable opportunities they can see.

  3. Training. At Buttonsmith (my company), we have built our approach around a de novo look at manufacturing in the age of digital production equipment. The new approach relies on new software, new tooling, and new processes. With each new product, we validate our approach by deploying it to our prototype production facility to see whether we can make competitive, high-quality products, at scale, in single-piece quantity. It's a pass/fail test. Because our floor is constantly evolving, part-time production staff don't work well. It takes six months for a new production employee's productivity to offset their cost, and a full year or more before they really develop facility in all of the production activities we do.

    Shifting from mass production to on-demand digital production isn't an incremental change. It either involves a significant cost in re-training or building a new facility from scratch.

    Programming languages are similar. It takes a solid year for an experienced programmer to become facile in a new style of programming language. Syntax is easy. Understanding new concepts and new idioms, and more importantly where to use them, takes time. Which becomes part of the economic calculus of adoption.

  4. Ecosystem. When a manufacturer makes a significant change in their products or distribution, existing distributors and suppliers ask questions. Are they adding something new or are they planning to shift their business entirely? What will that mean for us? Is this a change that threatens our business model? If you open up a direct-to-customer line of business, your distributors may be unsettled enough to replace you with another supplier. This will happen before the profits from the new activity can sustain you. Which means that some kinds of change involve high existential risk.

    Partly for analogous reasons, new programming languages get "proven out" by scrappy risk takers rather than established incumbents. Python's development began in 1989, but it didn't gain mainstream traction until around 2000. Even today, we see people argue that compiled languages are much faster than scripting languages. Which is true, but it entirely misses the point. Gene Amdahl, were he still with us, would be disappointed.

  5. The Inverse Funnel. In manufacturing, big things rely on littler things. So when the little things change disruptively....

    For more than a century, people have bought thread in spools of a single color. Over time, industrial embroidery machines have been released with more and more needles to accommodate designs with more colors. The most capable machines today have 15 needles, and Tajima is generally viewed as the leader in such machines. Earlier this year, Tajima revealed a device that colorizes white thread on the fly, using a digitally directed, full-color dye-sublimation process. It is able to switch colors every 2mm, which provides essentially continuous color in the final product. So this year, for the very first time, it is possible to do full-color embroidery with just a single thread. It is possible to do gradient embroidery. They expect to release automated conversion from PDF to colorizer instructions late this year. Without fanfare, embroidery has just become fully digital.

    A small irony is that the colorizer is typically demonstrated using a 15 needle embroidery machine with only one populated needle. Why? Because single needle industrial embroidery machines are no longer made. There is going to be enormous competitive pressure to adopt this colorizer, but the typical commercial embroiderer isn't rolling around in free cash. The good news is that their expensive multi-needle machines do not have to be discarded. The bad news is that cheaper single needle machines will return, and the cost of entry for a new competitor will drop accordingly. But all of the surrounding software, and the entire process of generating embroidery artwork, will be replaced.

    This is similar to what happens with new programming languages. Programming languages sit at the bottom of a complex ecosystem. The transitive cost to replace that ecosystem is awe-inspiring. The more successful the earlier languages have been, the bigger that cost will be. A new programming language may be one of the most expensive transitions you could ever try to drive with a single new piece of technology!

There's obviously more to the story, and many other factors contributed significantly in one example or another. The point, I think, is that when we succeed, researchers and early-stage engineers tend to be standing at the beginning of social processes that are exponentially more expensive than we realize and that take a very long time to play out.

Change is very much harder than we imagine.

Thomas Lord, an LtU regular, dies at 56

The full obituary is at the Berkeley Daily Planet: Thomas Lord 1966-2022

Thomas Lord died unexpectedly this week of a massive brain hemorrhage.

He is survived by his wife Trina Pundurs, mother Luanna Pierannunzi, uncle Christopher Lord, aunt Sharlene Jones, and many cousins and extended family.

Basic building blocks of a programming language

I've tried to understand category theory, and while I can't say I've got the mathematical skills to master it, I think I grasp what it intends to do: it is basically a way of building bigger things out of smaller things, as flexibly as possible.

Designing my language, I was inspired by this thought and started to think along those lines. That means trying to work out the basic building blocks from which larger blocks are built, i.e. where do I start?

In an attempt to figure this out, I thought about what languages typically try to do. What I have found so far is:

1. Calculate data, i.e. produce some data from other data - functions
2. Organize data into hierarchies (lists, structs, maps; let's call these types just to have a name)
3. Find patterns in data

The idea here is to make these elements as versatile as possible, so they can be combined without restriction as long as it is reasonable. Allow functions to generate other functions, hierarchies and/or patterns. Allow hierarchies to contain functions, other hierarchies and/or patterns. Allow patterns to describe functions, hierarchies and/or other patterns.
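
To make the idea of free combination concrete, here is a small sketch with Python standing in for the hypothetical language (all names are made up for illustration): a function that generates a function, a hierarchy that contains functions and another hierarchy, and a function that generates a hierarchy.

# Illustration only: the building blocks nested into each other.

def make_scaler(factor):                # a function that generates a function
    def scale(x):
        return x * factor
    return scale

double = make_scaler(2)

shape = {                               # a hierarchy containing functions...
    "name": "circle",
    "area": lambda r: 3.14159 * r * r,
    "style": {                          # ...and another hierarchy
        "stroke": "black",
        "width": double(1),
    },
}

def make_record(fields):                # a function that generates a hierarchy
    return {name: None for name in fields}

point = make_record(["x", "y"])         # {"x": None, "y": None}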

First of all, do you agree that these three could be used as basic building blocks, or is there something missing or wrong?

Secondly, the pattern. I can see patterns used in two ways that seem distinctly different. One is that you write template-like code, which you could see as a pattern: you insert a few values and out comes a type, a function, or something. The other way of using it would be to say that this is an expected pattern with some variables in it, then apply data to it; if the data matches the pattern, you get the values of the variables the pattern contained.
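
A small sketch may make the distinction easier to discuss, again with Python standing in for the hypothetical language and with made-up helper names: the first use is a template you instantiate to get a new function; the second is a shape with variables that you match data against to extract bindings.

# Use 1: a "template" pattern -- insert a value, get a new function back.
def comparer_template(field):
    def compare(a, b):
        return (a[field] > b[field]) - (a[field] < b[field])
    return compare

by_age = comparer_template("age")       # a generated function

# Use 2: a "matching" pattern -- a shape with variables in it; applying data
# either fails (None) or yields the values the variables stood for.
VAR = object()                          # marks a hole in the pattern

def match(pattern, data, out=None):
    out = {} if out is None else out
    if isinstance(pattern, dict):
        if not isinstance(data, dict):
            return None
        for key, sub in pattern.items():
            if key not in data or match(sub, data[key], out) is None:
                return None
        return out
    if isinstance(pattern, tuple) and pattern and pattern[0] is VAR:
        out[pattern[1]] = data          # bind the variable
        return out
    return out if pattern == data else None

bindings = match({"name": (VAR, "who"), "age": (VAR, "age")},
                 {"name": "Ada", "age": 36})
# bindings == {"who": "Ada", "age": 36}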

Perhaps those two cases of patterns should be named differently? Any thoughts on this?

My article on state machines and DSL evolution

I've been unhappy with state machines, activity diagrams, and BPMN as tools for modelling processes for a long time. In the article linked below, I've used the same principles that were used for moving from flat languages (Asm, Fortran 66, BASIC, etc.) to structured programming, to create a behavior model that is a kind of behavioral equivalent of Martin Fowler's state machine model.

https://dzone.com/articles/evolving-domain-specific-languages

(Teaser) The end result for the sample is the following:

LOOP {
    ESCAPE doorOpened {
        DO lockPanel, unlockDoor
        WAIT doorClosed
        ALL {
            WAIT lightOn
        } AND {
            WAIT drawOpened
        }
        DO lockDoor, unlockPanel
        WAIT panelClosed
    }
}
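
To make those constructs concrete, here is a minimal executable sketch of one possible reading of them using Python's asyncio. This is my illustration rather than anything from the article; the World class and the event names are placeholders for whatever signals a real system would provide.

import asyncio

class World:
    """One asyncio.Event per named signal, plus a log of performed actions."""
    def __init__(self):
        self.events = {}
        self.log = []
    def event(self, name):
        return self.events.setdefault(name, asyncio.Event())
    def do(self, *actions):                      # DO a, b
        self.log.extend(actions)

async def WAIT(world, name):                     # WAIT event
    ev = world.event(name)
    await ev.wait()
    ev.clear()                                   # consume the occurrence

async def ALL(*branches):                        # ALL { ... } AND { ... }
    await asyncio.gather(*branches)

async def ESCAPE(world, name, body):             # ESCAPE event { ... }
    stop = asyncio.create_task(WAIT(world, name))
    work = asyncio.create_task(body)
    _, pending = await asyncio.wait({stop, work},
                                    return_when=asyncio.FIRST_COMPLETED)
    for task in pending:                         # abandon the losing branch
        task.cancel()

async def LOOP(make_body):                       # LOOP { ... }, runs until cancelled
    while True:
        await make_body()

# The teaser, transcribed:
async def iteration(world):
    world.do("lockPanel", "unlockDoor")
    await WAIT(world, "doorClosed")
    await ALL(WAIT(world, "lightOn"), WAIT(world, "drawOpened"))
    world.do("lockDoor", "unlockPanel")
    await WAIT(world, "panelClosed")

async def secret_panel(world):
    await LOOP(lambda: ESCAPE(world, "doorOpened", iteration(world)))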

Do we need exactly two binding constructs?

I've recently been thinking about the relation between objects and lexical environments. Objects, especially under a prototype-based object system, look suspiciously similar to a lexical environment:

slot-ref <-> environment-ref
slot-set! <-> environment-set!
add-slot! <-> environment-define
parent(s)-of/delegate(s)-of <-> environment-parent(s)

However, one problem remains in the way of unifying those two binding constructs completely: the slot-scoping problem. If I simply use symbols (as in the case of environments) to designate slots, there's nothing to prevent two authors from coming up with the same name (say 'x), and the names can clash, especially in the presence of multiple delegation/inheritance. Therefore I figure I must use slot objects rather than symbols to designate slots:

(within-environment user-1
   (define x (make-slot)))
(within-environment user-2
   (define x (make-slot))
   (make-object x 1 user-1:x 2))

and... now environments bind symbols to slot objects, and objects bind slot objects to values.
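
A small sketch of that two-level arrangement, written in Python purely for illustration (the class and function names are all made up), may make it more concrete: environments map symbols to slot objects, objects map slot objects to values, and delegation walks only the object side.

class Slot:
    """An opaque identity; two make-slot calls can never clash."""
    def __init__(self, label=""):
        self.label = label
    def __repr__(self):
        return f"<slot {self.label}>"

class Env:
    """environment-define / environment-ref over symbols (strings)."""
    def __init__(self, parent=None):
        self.bindings = {}                  # symbol -> value (often a Slot)
        self.parent = parent
    def define(self, name, value):
        self.bindings[name] = value
    def ref(self, name):
        if name in self.bindings:
            return self.bindings[name]
        if self.parent is not None:
            return self.parent.ref(name)
        raise NameError(name)

class Obj:
    """add-slot! / slot-ref over slot objects, with delegation."""
    def __init__(self, delegates=()):
        self.slots = {}                     # Slot -> value
        self.delegates = list(delegates)
    def add_slot(self, slot, value):
        self.slots[slot] = value
    def slot_ref(self, slot):
        if slot in self.slots:
            return self.slots[slot]
        for d in self.delegates:
            try:
                return d.slot_ref(slot)
            except KeyError:
                pass
        raise KeyError(slot)

# Two authors can both call their slot "x" without clashing:
user_1, user_2 = Env(), Env()
user_1.define("x", Slot("user-1 x"))
user_2.define("x", Slot("user-2 x"))

obj = Obj()
obj.add_slot(user_2.ref("x"), 1)            # user-2's x
obj.add_slot(user_1.ref("x"), 2)            # user-1's x: a distinct slot
assert obj.slot_ref(user_1.ref("x")) == 2
assert obj.slot_ref(user_2.ref("x")) == 1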

This all looks fine, except that it makes me itch that I need to have two almost identical constructs, and I can't unify them into one! Are those two binding constructs exactly what we need, no more, no less?

Cicada language -- a new dependently typed language

Cicada language is a dependently typed programming language and an interactive theorem prover.

The aim of the cicada project is to help people understand that developing software and developing mathematics are increasingly the same kind of activity, and that people who practice these developments can learn from each other and help each other in very good ways.

Homepage at: cicada-lang.org

CUE: An open-source data validation language

There are two core aspects of CUE that make it different from the usual programming or configuration languages:

- Types are values
- Values (and thus types) are ordered into a lattice

These properties are relevant to almost everything that makes CUE what it is. They simplify the language, as many concepts that are distinct in other languages fold together. The resulting order independence simplifies reasoning about values for both humans and machines.

It also forces formal rigor on the language, such as defining exactly what it means to be optional, a default value, or null. Making sure all values fit in a value lattice leaves no wiggle room.
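
As a rough illustration of the "values ordered into a lattice" idea (a toy model, not CUE's implementation; all identifiers are invented), one can read every type or constraint as a set of allowed values and unification as the lattice meet, which also makes the order independence visible:

class Value:
    """A constraint: a named predicate over concrete values."""
    def __init__(self, name, pred):
        self.name, self.pred = name, pred
    def __and__(self, other):               # unification = meet in the lattice
        return Value(f"{self.name} & {other.name}",
                     lambda v: self.pred(v) and other.pred(v))
    def __repr__(self):
        return self.name

INT = Value("int", lambda v: isinstance(v, int))
GE0 = Value(">=0", lambda v: isinstance(v, int) and v >= 0)
def lit(x):                                 # a concrete value: most specific
    return Value(repr(x), lambda v: v == x)

# Meets can be taken in any order; the result constrains the same values.
a = INT & GE0 & lit(42)                     # like  a: int, a: >=0, a: 42
b = lit(42) & GE0 & INT
assert a.pred(42) and b.pred(42) and not a.pred(-1)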

The application of set-theoretic types is getting more and more mainstream, as they turn out to be very effective in (partially) typing dynamic languages (e.g. TypeScript). Other popular examples are Scala 3 and Kotlin, together with less mainstream examples such as Ceylon, Typed Racket, and the Avail programming language.

I dabbled in Avail a few years ago, and was very impressed by Avail's type-driven compiler, but not so much by its (extreme) DSL support. In contrast to Avail, I believe CUE is much more pragmatic while not allowing any 'tactical' compromises.

CUE has its origins at Google: the ability to compose CUE code/specifications is really useful at the scale that Google operates. The creator of CUE - Marcel van Lohuizen - has recently quit his job at Google to become a full-time developer of CUE. I think Marcel has created something really interesting, so I understand his decision!

Shen Standard Library

The documentation can be viewed at https://shenlanguage.org/StLib/stlib.html. Download at https://shenlanguage.org/download.html.

Mark

Trojan Source: Unicode Bidi Algorithm abuses in source code

A recent chunk of research caught my eye.

The skinny is that the Unicode bidi algorithm can be used to do all kinds of mischief against the correspondence between the visual display of code and its semantic effect, to the extent that the manipulation of code semantics is almost, but not quite, arbitrary. It's exceedingly easy to use this technique to hide trojan-horse-type vulnerabilities in almost any language that accepts Unicode source.

My own analysis is that this comes about because the 'tree' or 'stack' of bidi contexts is not congruent to the 'tree' or 'stack' of semantic source-code contexts.

In short, no bidi context should extend beyond the bounds of the most specific syntactic context in which it was created. And that includes 'trivial' subtrees like comments and string constants. As the compiler exits a clause, it must report an error if the bidi state that existed when it entered that clause has not been restored.

And even within the subclauses, keywords and structure must begin and end in the same bidi context. For example, if the language has an 'if' construct such as

'if (' + predicate + ') then ' + consequent + ' else ' + alternate

Then the predicate, consequent, and alternate clauses can push/pop bidi states within themselves, but keywords and parens must all begin and end in the exact same bidi state. And that bidi state - the one that existed before the parser started reading the subclause - must be restored when the parser leaves the subclause.

Right now, most Unicode-accepting programming systems treat bidi state as irrelevant and bidi override control characters as whitespace. This means that code that looks exactly the same in an editor can be read in a different sequence and have vastly different semantics.

Forcing the bidi level to remain congruent to the syntax level throughout the program means that a program that displays in a particular way either has a single valid semantics according to its sequence order, or the parts you see are in some non-obvious sequence order and it is therefore a syntax error.
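
As a rough sketch of the kind of check being proposed, here is a small Python fragment. It assumes a lexer has already split the source into tokens (including comment and string-literal tokens), and it is deliberately stricter than the full Unicode bidi rules (for instance, it requires every embedding, override, and isolate to be closed explicitly inside the token that opened it), which is exactly the point.

OPENERS = {
    "\u202A",  # LRE  left-to-right embedding
    "\u202B",  # RLE  right-to-left embedding
    "\u202D",  # LRO  left-to-right override
    "\u202E",  # RLO  right-to-left override
}
ISOLATE_OPENERS = {"\u2066", "\u2067", "\u2068"}   # LRI, RLI, FSI
PDF, PDI = "\u202C", "\u2069"                      # their terminators

def bidi_balanced(token):
    """True iff the token ends in the same bidi state it started in."""
    stack = []
    for ch in token:
        if ch in OPENERS or ch in ISOLATE_OPENERS:
            stack.append(ch)
        elif ch == PDF:
            if not stack or stack[-1] not in OPENERS:
                return False
            stack.pop()
        elif ch == PDI:
            if not stack or stack[-1] not in ISOLATE_OPENERS:
                return False
            stack.pop()
    return not stack          # nothing may remain open at the token's end

def check_tokens(tokens):
    """Yield offending tokens; a compiler would report these as syntax errors."""
    for tok in tokens:
        if not bidi_balanced(tok):
            yield tok

# Example: a string literal carrying an unterminated RLO override is flagged.
bad = '"\u202E; check if admin"'
assert list(check_tokens(['"ordinary"', bad])) == [bad]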
