[Ask LTU] How to implement concurrent languages ?

Hi all,

I used EOPL and TAPL to learn to write interpreters for simple languages and found both books tremendously helpful.

I've been using a custom scheme-like language (generating C code )for some of my consulting work with great success.

Now I am trying to learn how to implement languages with concurrency constructs (threads, actors, message passing concurrency etc). I could try to hack the source code of an existing concurrent language, but I was wondering if there was a book to learn from ("Implementing Concurrent Languages"? :-) ), along the lines of EOPL, explaining the various issues and tradeoffs on *implementing* concurrent languages. But this maybe too much to expect, I am just looking for a starting point.

Amazon has a few books on concurrent programming, but I couldn't find any that focussed on *implementation* of concurrent languages.

Any pointers (books, papers, interpreters to hack etc) greatly appreciated. LTU was very helpful the last time I asked a question (on how to write an FFI), hence this question. (Editors, If this question is not appropriate, please delete it )

*Any* information will be gratefully accepted. Help!! :-)

Thanks in advance,


Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.


To add to my post, for anyone interested in the same issues.

EOPL (edition 1, which I have) does explain the use of continuations to implement co routines.

The simplest example ?

Can you first point out the pages on EOPL-1 where he explains it ?

Secondly, I am coming from Forth and want to know how it implements the multitasking wrapper when someone writes independently running codes. Basically, forth is a multitasker which gives resource sharing and creates Virtual Concurrency.

I prefer some example C code which implements this without using the multitasking facilities of the OS, perhaps runs in DOS. Staged examples so that the OS is something more than an outer loop and actually switches between tasks. I wonder how the cycles are allocated and how interrupts are managed.

I am a newbie so I need simpler material, the kind of simple things published in forth journals.

I can also be contacted by email at stndshp at gmail.com if someone wants to contact me.



Chapters 2, 4 and especially 8 give theoretical information about concurrency.

An interpreter to hack on

My language, Rhope, might be worth taking a look at as an example. Source code for the interpreter is available under the BSD license on the downloads page. The code is rather lacking in comments, but I'd be more to point you in the right direction or answer any questions you have.

Rhope uses the dataflow paradigm for implicit parallelism (i.e. operations that are not dependent on each other operate in parallel automatically). This is currently implemented with a fixed-size thread pool pulling from a queue of ready operations. Currently, this is done at too fine grained a level for it to be terribly efficient, but I believe it will work well once I move to a VM-based interpreter with coarser scheduling.

Most operations have pass by value semantics (internally copy on write if the refcount > 1) so locks aren't needed in user code. Operations that have side effects (like I/O) use locks internally when needed. Sequencing of independent operations can be accomplished by creating a sort of false dependency (internally, operations can have a null input that is used for determining readiness, but not for actually passing data to the operation). Unfortunately this can only be done if the operations to be executed in sequence return at least one value (this might be fixed in future versions).

Global state is handled with a sort of transaction mechanism. Subroutines (currently called workers in Rhope's terminology) that need access to global state are declard to use a global store (sort of like a namespace, but only contains global variables not workers or types). Workers that only read from a global store get unfettered access (at some point a user might be able to restrict that for a specific global store). Workers that perform writes cause a new transaction to be created when the worker is invoked. When the worker exits the interpreter attempts to commit the changes back to the global copy. If the store hasn't been changed since the transaction started the commit succeeds, otherwise the worker is restarted with a fresh transaction.

The current transaction mechanism isn't appropriate for everything though. Starvation is a potential issue when there are lots of simultaneous commits and the mechanism isn't appropriate for protecting I/O operations so I'm probably going to add some options for using reader/writer and regular locking that can be set on a per-store basis.

Some links you may find of interest

I asked LtU about Scheme implementations that take advantage of multiple-processor hardware a while back

You'll find there links to articles about different Scheme implementations, with different approaches to concurrency each (MultiLisp, Termite, KaliScheme, PolyScheme, CrystalScheme, ConcurrentScheme).

Hi Ravi, I don't know of

Hi Ravi,

I don't know of any books on specifically on implementing concurrent languages. However, there are a number of very good books and articles about the subject. Here are a handful that I particularly like.

Koen Classen, A Poor Man's Concurrency Monad
William Harrison, Cheap (but Functional) Threads

These two papers explain how to implement a concurrency monad in terms of resumptions. They don't explain how to multiplex logical threads onto machine threads, but they explain everything you need to learn in order implement green threads. This will provide a bunch of useful context to understand the next pair of papers.

Olin Shivers, Continuations and threads: Expressing machine concurrency directly in advanced languages.
Peng Li, Simon Marlow, Simon Peyton Jones and Andrew Tolmach, Lightweight Concurrency Primitives for GHC.

Olin's paper has no math, and very little code, but it will give you the conceptual insight you need to understand what to do to take a green thread implementation and turn it into a "true" threading implementation. Li & co.'s paper works out all of these ideas explicitly, for the case of Haskell.

John Reppy, Concurrent ML.
Peng Li, Steve Zdancewic. Combining Events and Threads for Scalable Network Services.

Once you have the infrastructure, you need an API. John Reppy's book on Concurrent ML fully explains an extremely high-quality threading API. (It also offers a concrete implementation of a green thread infrastructure based on call/cc to read.) Li and Zdancewic explain another API, which builds on the CML work to even spiffier effect.

(As an aside, I like your blog very much.)

Thank You All

for the great comments.

I'll take a look at Rhope! Thanks!

Neel, that's a great list of papers/books. I think they should be sufficient to give me a grip on the issues involved.

"(As an aside, I like your blog very much.)"

Hey Thanks! I find your posts (on LTU, reddit and elsewhere) very informative!


More one book

I'm late but I recommend this book:
Concurrent Programming: Principles and Practice
by Gregory R. Andrews
It is a very complete book and presents many concurrent techniques for a Pascal like language (semaphores, conditional critical regions, monitors, asynchronous and synchronous message passing, RPC and rendezvous). At the end of each chapter presents some implementation techniques. I didn't implement anything, so I'm not sure how good is for implement a new programming language, but I remember that there was a continuity, it presented how to extend the kernel of the language assuming the previous mechanisms and it was very nice to know how things could be done.

Concurrent Programming: Principles and Practice

I have this book. On first glance it doesn't seem to tell me how to *implement* the concurrency features (as you point out), but it looks like a nice book otherwise, with lots of exercises (I love working through exercises! :-)).

Maybe there is some implementation advice in it. If there is, I'll post an outline here.
Thanks Again,


When you say "implement" I am not sure I know what you have in mind. There are several levels of abstraction between the language construct and the cpu, and you probably want to understand all of them - but you may still decide to base you implementation on an existing VM or use OS concurrency features etc. So what level are you interested in?

If you are interested in a high level perspective than a continuation based implementation (covered in EOPL2 for example) may serve your needs. I also recommend the early paper showing how Erlang was implemented via a meta-ciruclar prolog interpreter.

However, I sense you are interested in low-level details, which are mostly outside the scope of language design. Be that as it may, reading on the design decision that led to the concurrency features of Ada may be of interest, since the language was designed with specific attention to the efficiency cost of implementing the concurrency features. The Ada realtime working group produced useful documents about these issues that may be of interest to you.

Ehud, Good Question

I am interested in a "middle level" perspective (lower than continuations, higher than hardware)

I assume I have an operating system (say linux for the sake of this discussion) with processes.

I want to build a VM based concurrent language (say a mini erlang, or a language with threads, like Java) with garbage collection. How does one go about (attempting) this? If I am implementing a language should I go with message passing or threads? What are the trade offs? What kind of VM/byte code does each decision result in ? These are the questions I was interested in.

I want to learn how to design the language (and build the VM, which I consider "language design", but I could be wrong) so that the semantics of my designed language map to the underlying OS primitives. Part of this is understanding the details of the OS primitives and part (I think) is about the VM design.

EOPL and L.I.S.P take the approach of deriving the bytecode/VM from an extremely simple interpreter. I find this a very compelling approach and was wondering how one would do this for a language spec which has concurrency features (assuming an OS). Maybe this is not possible. I don't know , hence the question :-)

If the question was not appropriate to LTU, I apologize and promise to desist. After reviewing the policy docs, I realize these questions are rather more "concrete" than the theory focus of the LTU would warrant.

Thank You for the reference to the ADA design decisions.

I would start with something

I would start with something like Erlang - "processes" with message-passing. Unfortunately if your language has assignment there is no way (that I know of) of guaranteeing isolation of processes within the same OS process without deep-copying every mutable object that newly spawned threads can access (this might be very expensive). You could choose to leave it for the programmer to deal with. Maybe some LtU-ers know about some clever type system which can deal with this problem.

Process Algebra based concurrency

Quite some time ago, I have implemented Process Algebra based concurrency as extensions to Pascal, Modula, C, C++ and Java; for the latter extension, see the Google Code project. Warning: the code is old, and of type "don't worry, be crappy".

I am working on an improved follow-up language, that now extends Scala. The language definition and examples are again at Google Code. There is a page on the implementation model. This shows an Hello World example with a sequential operator (first Hello, then World), but parallel operators work quite similar.

I have recently started the implementation; I hope to upload some sources in a few weeks.

BTW I am looking for contributors. Please contact me if you are interested.