process oriented PL boot-strapping

A few days ago I had a somewhat interesting idea about bootstrapping PL (programming language) tools and components after you assume a process-oriented architecture. This post amounts to sharing. The entertaining part, if there is one, will become apparent on reflection. To make this less concrete, I'll start with a minimal bit of abstraction about what process means.

Suppose asp means abstract sequential process (chosen so pronounced as one syllable with an arbitrary concrete/visual image of a snake suggesting a chain of continuations). Then say nop means native os process, typically implying an address space and whatever the OS does in the way of a process. When you launch an app written in C on some platform, typically it starts a new nop and eventually gets around to calling main() in the main thread, which is the root of the asp which lives in that nop.

An asp is something like a fiber, but a typical C program with no threads and no extra fibers will amount to one nop, one thread, and one fiber, thus seeming synonymous with the asp as executing code in a general sense. The idea of asp is only useful when you think about mapping it differently. You can run an asp in a thread (inside a nop), or you can run an asp inside a fiber (inside a thread inside a nop). If an asp talks to the world only via messages, for example using standard input and output streams, then you can map it somewhere else while preserving behavior. An asp run as a standalone nop would be similar in behavior to one run in a thread or fiber, but be much easier to debug that way in isolation, where it can't corrupt memory in another asp, for instance. It doesn't matter what language an asp is written in, when the only way you interact with it is via messages.

Here's the part I find entertaining. To bootstrap, you can start with an asp written in a way you find totally unacceptable in the long run, as long as it helps you write new versions you run later with better properties. You can write an interpreter that's a slow memory hog that does everything slowly and monolithically, but use it to write assorted tools and transpilers that later run much more efficiently and asynchronously, perhaps compiling itself to a better form to use in subsequent invocations. Since every asp is independent, except to the extent it depends on cooperation of other entities it messages, you can run old and new versions of an asp at the same time, to compare in tests or to stage incremental service upgrades.

The same general daemon nop architecture can be re-used in each version: something that spins ups and listens to connections, running asps in threads, perhaps inside fiber pools within. You can start with one as a root session that starts other daemon nops as requested, when you want parts of service isolated or restartable without a root session reset. Maybe one child nop is the virtual file system, for example. You can rebuild executables and launch new child nops with upgraded features to replace old inferior child nops, then shift service to new instances. As long as you are connecting everything with channels managed by daemon non-blocking i/o features, it doesn't matter how many native processes you have, or where things get assigned, and you can use config specs to change where things go.

I thought it would be fun to be able to write terrible (but easy to understand) early versions of things that can generate new and better versions that run with identical external process interfaces you can swap dynamically. Bad scaffolding architecture doesn't need to infect or influence later improved versions. So you can start out writing things in the most general way you like, then slowly generate concrete translations of that into the target quality you want to read over time through stepwise evolution. This idea might help beginners in PL research, since it means you don't need to be especially careful about your starting point. Process separation implies disposable steps.

Seems like a good plan

For my project, I had already planned to write an expedient interpreter first, so I can work out the front end early. However, I hadn't really thought about putting in a multitasking infrastructure up front.

One problem is that I want to be able to pass generic resources to any subtasks. I suppose, for a jury-rigged startup, I could have a parent process mediate any system I/O for everybody, and take instructions over message channels from other tasks. However, it seems like that would make passing large memory-resident structures inefficient.

Another point is that this would tend to drive me to choose a parallel architecture, and I want to be careful and choosy about that. Funny thing, I started with a particular view of the kind of parallel operations I wanted to support (i.e., specifying order and dependencies of external parallel communications), and chose a referentially transparent functional model to support it. However, after working on that for a while, I fell from my original faith, and now find myself strangely agnostic about which parallel models I want to support...

By Bruce J. Bell at Sat, 2015-05-30 07:10 | login or register to post comments

mainly trying to give folks suggestions

To keep my first post on-message (processes and bootstrapping), I skipped over several associated ideas, because inspiring folks to develop things was my main priority. Paralysis due to perfectionism is an easy trap to wallow within, waiting to start until a plan is clean enough, even though ongoing version replacement is a natural steady state to reach anyway. (You will never be done, unless a project dies and is scrapped.) Toy versions of things are a good way to bootstrap industrial versions. And it also helps with testing, since "do it twice and compare" is a fine tactic to check for consistency and/or conformance.

Throw-away code is the bane of developers, and seems a total waste of time, to write something you plan to throw away. I suggest you don't throw away simple versions, and keep them for bootstrapping and tests, so it's not a waste. But as you evolve the best version of something, there's a temptation to not maintain old versions too, because it costs too much, even if that was the only way you were going to compare. Abandoning tests seems like a slow way to kill a project with technical debt, over time, as you lose your handle on evidence about what happens.

Now I'll focus on your generic resources and parallel architecture points. Let's do resources last. To support parallel execution when possible on hardware, code needs a concurrent architecture, such that it doesn't know whether serial or parallel execution happens. Both parallel and concurrent are longish words, which sucks. There ought to be shorter words conveying similar ideas. To go find them, first you need to think about the basic point, which is sub-division into tasks that need not be serial, and what characterizes the entities doing this, like processes. Instead of one thing, you have a cooperating group, organized in some kind of federation ... except that's a long word too -- what are short synonyms? Note it's important to avoid picking homonyms that sound just like existing technical terms, so avoid corps which sounds like core, etc. A group of processes might be a band, a flock, a herd, a troop, or a shoal, etc. Instead of concurrent, we might say banded, for example. Instead of focusing on concurrency per se, we might instead attend how process groups cooperate, and how they frame the story about collective action. Then we can write prose with more direct/active verbs, instead of talking about passive qualities like 'they are concurrent', because it smacks of defining things by negative space of what they don't do: well they are not serial.

When I'm not interested in most PL work, the main reason is absence of multitasking infrastructure, as if it's safe to ignore until you write an app that interacts with a server. An idea lots of things are happening concurrently, and a given piece of code is just one fragment, is something you want to make sense in your language's world view, so you aren't fighting to frame a valid perspective when developing. If I bring up a language REPL, I want it to be just one of N things happening, not the sole flat world view into which parallelism must be retrofitted poorly. You can get clunky designs, like designing everything to be "multithread safe" so you can spawn threads and spend the rest of your life stamping out bugs from races in object-orented code that pretends concurrency is a minor after-thought managed like an aspect. (Chimps-with-revolvers school of coding.)

Part of architecture should be at least one flavor of interaction that disallows shared mutable state, so interacting members of a group might be in different native processes, or the same one but with immutable access to shared state. So messages might be copied or they might be shared, and this abstraction permits a process to move, during one session or across multiple sesssion. (In same native process is best for optimal performance with huge numbers of small tasks, but worst for diagnosing faults when memory corruption is a potential factor.) If your language's build model includes translation to C and compilation with an external development stack, you can use shell commands to build new executables concurrently, then let new processes join a group dynamically at runtime, without restarting the process coordinating everything. If your program is like a house, you can add a new room without tearing down the old building, if rooms were loosely coupled in the first place.

I have way too much to say about resources, so I'll narrow it down to just two things. First, I want to be able to kill a process and get the resources back, even if this was just one of N tasks inside the same native process. Partly as consequence, I don't want global resources owned by no process, since that would allow memory exhaustion, for example, without a process to blame. Shared resources should be owned by someone (and killing that process might cause everyone using that shared resource to die subsequently as well). Second, I prefer a namespace-oriented manner of getting access to resources, so subtasks don't have direct control over the bindings, so parent systems can pass resources generically by namespace operations. In this context, a virtual file system works pretty well. I'm inclined to have this model extend to PL module namespaces, so symbols are defined on paths, where these paths are also visible in the VFS. I think a path to meta-info about symbols should appear in the file system, even if that part is hosted in local native process memory. Browsing the current state of the symbols system from shell command line or browser could work.

By Rys McCusker at Sat, 2015-05-30 22:23 | login or register to post comments

Hey, I resemble that remark!

That is, the "waiting to start until a plan is clean enough" is an accurate diagnosis of my current situation. I agree, I need to try writing a prototype sooner rather than later, and an inefficient toy version is probably just what the doctor ordered. But, to continue my analysis paralysis for just a bit longer...

The properties of the platform an implementation is built on tend to shape the language it supports. For example, I'm currently looking into Scala, which is built on the JVM; this fact strongly colors its design, in ways I have no wish to emulate. This says nothing against the language itself -- but, it's not what I'm looking for.

On consideration, the above is a particularly good reason for choosing a concurrency model for an early implementation. However, it is an equally good reason for choosing carefully. So, enumerating options from null: a purely functional subset can get away with indeterminate evaluation order. On the other hand, any useful concurrency requires fairness as some kind of systemic feature, which in my book ought to be expressed as a language primitive, if you want your language to handle concurrency natively.

Beyond that, there are three concurrency models that seem suitable, but they seem sufficiently different that I hard time comparing them. I'm also not sure which, if any, are appropriate for building into a quick&nasty prototype, with an eye towards coloring a language suitable for systems engineering...

First, the external concurrency I mentioned earlier. I'd like to be able to specify dependencies of external communications as a directed acyclic graph, and I haven't come up with a particularly usable way to do so. Maybe this is more suitable for a, um, graphical visual interface, but it still needs to be representable in conventional textual format, and it needs solid semantics in any case. Possibly something derivable from pi calculus? I would prefer to express communications as one-to-one single channels; if you need to arbitrate between multiple channels, use an arbiter.

Second, I'm a fan of software transactional memory, but uncertain about the most appropriate scope for its use. It seems like the right way to do the kind of concurrency it supports, but it also seems like a bad idea to wedge everything into that model. As opposed to the external concurrency thing, it seems focused on internal concurrency, even to the exclusion of non-STM effects.

Finally, I actually like the old fashioned monitor-with-conditions, as a plausible general model of OS-supported processes. As with the external concurrency case, I prefer one-to-one communication links.

By Bruce J. Bell at Tue, 2015-06-02 14:58 | login or register to post comments

somewhat random feedback without a theme of my own

I appreciate hearing your ideas. I can offer a little free association. Part of my first point was an idea of making tools to make more tools, where simpler prototypes can help make more complex products. (And maybe more complex versions can later consume the same simple early inputs.) When inputs and outputs are related to making a language work, or its runtime, there's potentially a lot of sessions involved. The execution behavior of early sessions need not reflect on what you get later, after some evolution. I suspect tunnel vision about the end can blind folks to cheap tactics in the beginning. Now I'll try to focus on your remarks.

The properties of the platform an implementation is built on tend to shape the language it supports.

Definitely true if the runtime of the platform is the target, but need not be true if the target runtime is something else, another platform. I guess it depends on what you want to happen. (Saying that made me cringe a little, as too obvious, but folks sometimes forget to say the most obvious thing, which is "my list of desires" when focusing on a plan in response.) For example, it may matter whether compilation will happen at runtime too during production, and whether this can freeze a service into relative unresponsiveness. The way you might want a compiler implemented could vary if sometimes you want it to operate under non-blocking high volume network loads. I don't think many folks want that though.

Maybe there's no need for a concurrency model when a PL-dev designing a language has no need for concurrent behavior at runtime. But there's no reason why the resulting language would be any use to another dev who's highly dependent on concurrent runtime effects.

I grasp developer psychology pretty well when serving self need -- designing code for personal and group goals. But I'm not good at all when it comes to dev pyschology oriented toward getting positive feedback, developing specifically for praise, where "what people may say" can dominate concerns. Mass psychology has always struck me as pretty dumb, and having ignored it too much, I don't understand people motivated by it. Folks deeply sensitive to status are like schizophrenics who hear voices to me; I can't relate to either well, and it seems pointless to try. Thus the entire dimension of "what is now popular in PL design?" seems an irrelevant concern to me, unless serving a market niche driven by a popular mindset. It never occurs to me ask what popular target platform is wanted by a market segment, even most market segments.

Instead I tend to assume someone working on a PL is trying to get an effect now hard to realize with existing tools, or otherwise an existing tool would be just fine to use as it is now. Why make a new PL when no effect is easier to get? For sake of amusement or play, I understand. But as a serious goal, I struggle to follow. When someone targets the JVM, I assume it's a DSL that is more palettable [sp. palatable], since you are somewhat at the mercy of however it behaves, and can't easily get different effects unless you wanted to say no a lot more often (because being strict is entertaining). So my unconscious expectation is that platform is wide open: you would target whatever you wanted for the new effect desired.

I'm trying to avoid characterizing my own needs. It's not helpful, since not many folks have similar requirements. To even test certain kinds of thing, I need to be able to generate large distributed loads, or simulate them as efficiently as possible, when not much hardware is around. I can never have a testbed with as many machines as the number of endpoints I want to simulate in mobile apps. But I want to fill hardware bandwidth at hand, without wasting a lot of cycles on runtime admin. I want effective actor count to get as high as possible with local hardware, at performance that will be a green light for shipping.

I tend to think about HyperCard when imagining something that makes sense to other folks, instead of ridiculous infrastructure problems. But it seems like you'd want a frontend to compile to JS for browsing, with a weird separation between front and back ends. How can you dispatch a mouse-enter event on a field whose content and behavior just changed concurrently in the backend? Do you show it's frozen awaiting update before the UI responds as expected again? Stuff like that. (I don't like thinking about UI very much.)

I also like old fashioned monitor-with-conditions for shared mutable state visible across (lightweight) processes, but I would probably only do that inside one thread, with messaging for between-thread and between-native-process activity.

By Rys McCusker at Wed, 2015-06-03 02:49 | login or register to post comments

resources as chained linear values

On the Rust thread, there is some discussion of linear values in functional programming being similar to imperative values minus aliased state, which is basically the tack my project takes.

I've been interested in linear types since I read Girard's paper on linear logic. But looking at what happens in languages (such as Clean) which use linear types directly, you almost always see a chain of operations that bind a sequence of linear successor values to a sequence of similar names, which seems suggestively redundant.

I prefer to present "resources" as chains of linear values bound to the same name, and "operations" as functions that consume one such linear value, and implicitly provide a successor value. This is basically what programmers expect out of their mutable/imperative variables, but constructing them this way provides a handle which allows the language to enforce referential transparency, as a form of draconian alias control.

Because the elements of the chain are linear, the type system will not accept providing the same resource multiple times: accesses to any given resource must be linearized. This is essentially the same thing that happens with "naked" linear values, but in a more convenient (and familiar) form. Using this construction, the language can also require that each use of an operation describe by name the resources being used. This makes effectful computations referentially transparent by construction, as well as straightforwardly tagging them with types describing their associated effects.

Note that, like a regular function, this kind of operation is a simple value -- unlike a resource, it can be reused, and composed to form more complex operations. In this last respect, they seem similar to Haskell-style monads (although I think they ought to be more flexible in practical use?)

One potential problem is that the alias control might be too draconian, rendering the language too inexpressive to be useful for large-scale systems. (Although, it might not be a bad idea to start there, and introduce more detailed facilities by extending the type system?)

Also, to get back to the topic at hand: can this form of resources and operations support the kind of native concurrency appropriate to a systems programming language?

By Bruce J. Bell at Wed, 2015-06-03 01:49 | login or register to post comments

This is basically what

This is basically what programmers expect out of their mutable/imperative variables, but constructing them this way provides a handle which allows the language to enforce referential transparency, as a form of draconian alias control.

So typestate, basically. Which I think is perfect for a systems language, better usability than linear types IMO, assuming you can adopt borrowing ala Rust to make the aliasing limitations more palatable.

By naasking at Wed, 2015-06-03 03:54 | login or register to post comments

Instead of complicated borrowing, make variable races impossible

Instead of complicated borrowing rules, make variable races impossible, e.g., see Variable races are impossible in ActorScript

By Hewitt at Wed, 2015-06-03 09:08 | login or register to post comments

I do not follow

I did try to skim/read that section of the linked paper. I am guessing that you are saying variable races inside a single actor can be rendered impossible, e.g. simply by saying that actors have run-to-completion handling of each incoming message?

Whatever happens inside an actor to keep things safe and sane is nice, great. But the classic problem with actor-y systems is that nothing inherently about actors prevents all the same old issues happening at the next level up of code that coordinates actors in any way, I hazard to guess.

By raould at Wed, 2015-06-03 21:46 | login or register to post comments

non-preemptive actors (or processes)

(I didn't read it either.) I assumed Hewitt meant actors are single-threaded, where only one thread can read/write a variable, so no pre-emptive races exist. I believe all lightweight processes should be like this too, so only cooperative races can exist, across multiple async operations that admit cooperative scheduler pre-emption.

But yes, there is still the problem of coordinating many related reads and writes that must occur together in a critical section without interference. Even with a single writer making all changes, the moral equivalent of long-running transactions might be needed in actors or lightweight processes. It's the problem of whatever layer adds the semantics of "all these changes must occur together with this invariant" -- that layer suddenly takes on the responsibility of being a broker coordinating transactions.

(Edit: I agree with Thomas, it's not a good idea to do this, but you can. Care to avoid hanging yourself always comes up.)

By Rys McCusker at Wed, 2015-06-03 22:12 | login or register to post comments

re non-preemptive actors

still the problem of coordinating many related reads and writes that must occur together in a critical section without interference

If it hurts when you try to do that, don't.

Either you can send a transaction spec for the whole set of related reads and writes to a single actor or you can't. If you can't, change the spec of your program to eliminate the appearance of a need that you must do what you can't do.

It's the problem of whatever layer adds the semantics of "all these changes must occur together with this invariant" -- that layer suddenly takes on the responsibility of being a broker coordinating transactions.

Doing that at scale is easy if you don't mind it being slow. If you need that at large scale AND fast, redefine the problem to get rid of the imagined need.

By Thomas Lord at Wed, 2015-06-03 22:18 | login or register to post comments

seeking enlightenment

Please do post any urls to anything I can read to help me get some ah-hahs about your guidance.

By raould at Wed, 2015-06-03 22:55 | login or register to post comments

batch efficiency

Batch operations are generally more efficient than successive piecemeal effects, especially if failure is costly, or if round trip latency is high per operation because it causes context switches (that may easily run other things before your own follow-up).

For example, if you need three things (locked or allocated), then ask for all three things at once so the batch as a whole succeeds or fails. You can also say "I need to reserve these three things" and then await notification you have them.

A group of successive operations can also be done more efficiently if the callee knows what you have in mind. As separate operations, no one knows what you are going to do next, unless you declared a flight plan in advance. When you reveal the whole plan, a more efficient way to do it might exist because of private internal details unknown to you.

It makes an api more complex to express a set of reads and writes as a lump, with some logic to decide whether and how to do it, but clarity of transaction is improved, and performance too when someone optimizes what you requested.

By Rys McCusker at Thu, 2015-06-04 00:02 | login or register to post comments

re enlightenment

It's not very complicated. Rules of thumb:

At small scales, fast multi-mutation transactions are easy.

At large scales, fast multi-mutation transactions are hard.

At large scales, slow multi-mutation transactions are easy.

Sometimes people hand you a spec that calls for, e.g., fast large-scale multi-mutation transactions.

Before trying to overturn the unverse and make fast, large-scale transactions easy, look deeper into the particular need the spec is trying to fill in the context of social production and social activity.

Odds are probably good that you can reframe the problem to be solved in such a way so as to eliminate the need for large-scale but fast complex transactions.

If you can't reframe the problem to eliminate that need, then there's a second question: is the "need" to solve this problem a fantasy, like asking for dry water? or asking for the ability to order back the tide? It might be theoretically possible to order back the tide but you wouldn't necessarily want to live in a world where people are brute forcing that kind of hack.

PL design "should" (I don't know where I got this normative hat but I'm putting it on for the moment): PL design "should" make easy things easy and encourage hard things to be reconsidered.

Better?

By Thomas Lord at Thu, 2015-06-04 00:11 | login or register to post comments

sort of

It is more clear to me when I read the actual real-world examples that come up in, for example, the DDD crowd. So, yes, I do get what you are getting at now. And for sure agree.

By raould at Thu, 2015-06-04 18:02 | login or register to post comments

Actors are based on message-passing semantics; not preemption

Actors are based on message-passing semantics; so preemption doesn't make much sense.

By Hewitt at Thu, 2015-06-04 21:24 | login or register to post comments

re actors are base on....

"Actors are based on message-passing semantics; so preemption doesn't make much sense."

Right. The counter-argument someone made was roughly:

Even though individual actors have message-passing semantics, the problems of asynchronous mutations to shared state can still occur at a higher level of complex assemblies of actors.

So I was just replying that: Perhaps analogous problems can arise at higher levels but if a spec seems to require some complicated solution for the higher levels, the problem is presumably the spec.

By Thomas Lord at Fri, 2015-06-05 17:47 | login or register to post comments

rip typestate

things like typestate is dead, long live typestate make me shudder as i read them.

there was typestate in rust, then it went away, now we can get the same effects only less so with a lot more weird work and have to start explaining design patterns?!

i dunno.

By raould at Wed, 2015-06-03 21:41 | login or register to post comments

Lambda the Ultimate

User login

Navigation