Adding Concurrent Constructs to a Language with State

There is a lot of discussion on LtU, past and present, about the advantages that a pure functional language has w.r.t. concurrency, but for several reasons that I won't go into here, I am designing a non-pure multi-paradigm language called Virgil. Originally I focused on making the language usable for microcontroller programming, whose interrupt-driven model I managed to accommodate without introducing any constructs for concurrency. Virgil had no synchronized { ... } blocks, no atomic { ... } regions, and no event system. Instead, the programmer could attach methods to what I called entrypoints, which represented where control would transfer in the event of an interrupt (or nested interrupt). Thus a one-stack model sufficed. Mutual exclusion was managed manually by manipulating hardware state.

But now I want to make Virgil into a larger, more complete language. I have been redesigning the syntax a bit and building a new compiler in the language itself, bootstrapping off the earlier compiler's interpreter, which I wrote in Java.

I am faced with the question of what concurrency constructs to introduce, and how they can express these kinds of concurrency:

- Interrupts: e.g. in a one-stack interrupt-driven microcontroller
- Events: e.g. in a GUI framework
- Async IO: e.g. in a high-performance IO subsystem for a webserver
- Threading: e.g. multiple concurrent threads of computation for multicore
- (others)

It seems that other languages have made the leap from a single-threaded, non-concurrent model to a different model through the introduction of monitors and threads (e.g. Java), a threading library (e.g. C), atomic regions (e.g. nesC), or other language mechanisms.

I'm curious if people have advice in this direction.

My 3 favourites

- Join calculus
- E's event loops
- Orc orchestration language

[Edit: all of the above concurrency models have been added to general-purpose languages with state, so they should meet your criteria without being as error-prone as programming with threads and monitors.]

The problem with concurrency isn't state...

it's shared state.

The reason FP languages avoid state, even in the sequential case, is referential transparency--which permits quite a few analyses that the presence of mutable cells makes intractable. But if you are going to have a stateful language, there are still good reasons to confine "general" mutable state to within a thread--i.e. each thread gets its own mutable state, but threads can't share it. Threads only communicate via messaging or other "safe" means.

Given that, here are a few interesting means of communication between threads in a concurrent system:

* Dataflow variables. A dataflow variable is a shared variable which is set only once but may be read many times. Setting can be limited to a particular thread, or any thread might be able to set it--but only the first will succeed. (Subsequent threads that try to set it to the same value might also appear to "succeed".) Reads prior to the set may block or fail. See PVR's excellent book for more on this style of concurrency.
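
To make this concrete, here is a minimal sketch in Java (chosen purely for illustration): java.util.concurrent.CompletableFuture already behaves like a single-assignment dataflow cell, in that get() blocks until the value is set and only the first complete() succeeds.

```java
import java.util.concurrent.CompletableFuture;

public class DataflowDemo {
    public static void main(String[] args) throws Exception {
        CompletableFuture<Integer> cell = new CompletableFuture<>();

        // A reader blocks until the cell is set.
        Thread reader = new Thread(() -> {
            try {
                System.out.println("read: " + cell.get()); // blocks until complete()
            } catch (Exception e) { /* interrupted or failed */ }
        });
        reader.start();

        // Only the first complete() succeeds; later attempts return false.
        System.out.println("first set:  " + cell.complete(42)); // true
        System.out.println("second set: " + cell.complete(99)); // false; value stays 42

        reader.join();
    }
}
```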

* Message queues. A message queue is a FIFO data structure that, in the concurrent case, is generally owned by a particular process; other processes can post messages to it. Messages are generally copied from one process's memory space to another, though on a shared-memory machine, shared memory might be used under the hood (if data is immutable, a reference is generally indistinguishable from a copy). Message queues can be implemented "on top of" the dataflow variables described above.
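
Again as a hedged Java illustration: a BlockingQueue owned by a consumer thread gives exactly the blocking post/take behavior described above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueDemo {
    public static void main(String[] args) throws Exception {
        // The queue is "owned" by the consumer; producers only post to it.
        BlockingQueue<String> inbox = new ArrayBlockingQueue<>(16);

        Thread producer = new Thread(() -> {
            try {
                inbox.put("hello"); // blocks if the queue is full
                inbox.put("world");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        System.out.println(inbox.take()); // blocks while the queue is empty
        System.out.println(inbox.take());
        producer.join();
    }
}
```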

* Mailboxes. Mailboxes, as in Erlang, are essentially message queues with extra processing on the read end, which permits nifty things like out-of-order reads. All the fancy stuff occurs in the context of the reading process.
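
Here is a hypothetical sketch of such a mailbox in Java; the Mailbox class and its send/receive methods are invented for illustration, with a Predicate standing in for Erlang's pattern matching. receive returns the first buffered message that matches, skipping (but keeping) everything else.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.function.Predicate;

public class Mailbox<T> {
    private final LinkedList<T> buffer = new LinkedList<>();

    public synchronized void send(T msg) {
        buffer.add(msg);
        notifyAll(); // wake readers so they can re-scan the buffer
    }

    public synchronized T receive(Predicate<T> pattern) throws InterruptedException {
        while (true) {
            Iterator<T> it = buffer.iterator();
            while (it.hasNext()) {
                T msg = it.next();
                if (pattern.test(msg)) { it.remove(); return msg; } // out-of-order read
            }
            wait(); // no match yet; block until another send() arrives
        }
    }
}
```

A reader can, for instance, pull an urgent message first with receive(m -> m.isUrgent()) even if it arrived after other messages.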

* Object spaces, or tuple spaces. A form of shared memory which is "safe"--essentially a big blackboard of objects that processes can concurrently read from and write to. Implementations exist for both shared-memory architectures (which is easy) and networks (which is harder).
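
A hypothetical in-process sketch, again in Java: the TupleSpace class below is invented for illustration, with write/read/take loosely modeled on Linda's out/rd/in primitives. read finds a matching object without removing it; take removes it.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Predicate;

public class TupleSpace<T> {
    private final List<T> board = new LinkedList<>();

    public synchronized void write(T tuple) {
        board.add(tuple);
        notifyAll(); // wake any blocked readers
    }

    // Non-destructive read: blocks until a matching tuple is on the board.
    public synchronized T read(Predicate<T> match) throws InterruptedException {
        while (true) {
            for (T t : board) if (match.test(t)) return t;
            wait();
        }
    }

    // Destructive read: removes and returns the first matching tuple.
    public synchronized T take(Predicate<T> match) throws InterruptedException {
        while (true) {
            Iterator<T> it = board.iterator();
            while (it.hasNext()) {
                T t = it.next();
                if (match.test(t)) { it.remove(); return t; }
            }
            wait();
        }
    }
}
```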

You might also study the various process calculi and parallel models that have been developed over the years: Actors, CSP (communicating sequential processes), CCS (calculus of communicating systems), the pi calculus, the join calculus, etc.

Getting concurrency right is *hard*, and how to do it well is still an active research problem. Erlang is probably the most mature concurrent language. Java is how you DON'T want to do it. :)

While some may think it old

While some may think it old fashioned, I suggest looking at the Ada concurrency mechanisms (both task types and protected types).

For "old" read "time-tested"...

Along the same lines, you might want to look at the occam concurrency constructs (which can also be traced to the same roots in CSP as Ada's concurrency model). Both occam and Ada have been successfully used in the embedded domain for many years. Occam favors a model based around statically-defined process networks, which seems like it might fit well with Virgil's compile-time initialization approach.

On a more general note, you might want to take a look at Ed Lee's The Problem With Threads, which looks at concurrency from the perspective of embedded computing. It includes a good (if brief) overview of several different models of concurrency and their pros and cons from the perspective of embedded systems (the discussion of "coordination languages" is perhaps less relevant to your immediate needs).