AsyncScala: DSL for coordinating asynchronous processes in Scala

I would like to present my project AsyncScala:

The project is based on the Actor model, but with some twists that make it different from most implementations of the Actor model that I know of (save E, from which I borrowed the core ideas, and the new EcmaScript effort by Mark Miller).

Composable asynchronous components in one event loop

In most actor systems, one asynchronous component = one event loop. With the traditional receive operation, a single receive knows about all events and handles them. As a result, components can exchange only messages, and for shared-memory interactions they have to resort to transactional memory and similar things.

However, there is another example of event-loop programming: GUI systems. In them, many asynchronous components share a single event loop, and a wonderful thing happens: event handling is modularized, so each component handles its own events. Just consider the effort that would be required to build an actor that uses a receive operation to handle every event on a GUI event loop. With modularized event handling, even half-trained VB programmers can manage it. And within a single event loop it is safe to share memory.

Like E, AsyncScala uses this approach; many of the nice DSL features would not have been implementable in a sane way without safe memory sharing.
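To make the point concrete, here is a minimal sketch of two components sharing one event loop and one plain mutable buffer. The names EventLoop, Producer and Summer are made up for illustration and are not AsyncScala's actual API; the only assumption is a single-threaded loop that runs queued actions one at a time.

```scala
import scala.collection.mutable

// Hypothetical single-threaded event loop ("vat"); not AsyncScala's actual API.
final class EventLoop {
  private val queue = mutable.Queue.empty[() => Unit]

  // One-way send: enqueue an action to run on a later turn of the loop.
  def send(action: => Unit): Unit = { queue.enqueue(() => action); () }

  // Run queued actions one at a time until the queue is empty.
  def run(): Unit = while (queue.nonEmpty) queue.dequeue().apply()
}

// Two independent components. Each handles only its own events,
// yet both touch the same plain, unsynchronized buffer.
final class Producer(loop: EventLoop, shared: mutable.ArrayBuffer[Int]) {
  def produce(n: Int): Unit = loop.send { shared += n }
}

final class Summer(loop: EventLoop, shared: mutable.ArrayBuffer[Int]) {
  var total = 0
  def sumUp(): Unit = loop.send { total = shared.sum }
}

object SharedLoopDemo {
  def demo(): Int = {
    val loop = new EventLoop
    val shared = mutable.ArrayBuffer.empty[Int]
    val producer = new Producer(loop, shared)
    val summer = new Summer(loop, shared)
    producer.produce(1)
    producer.produce(2)
    summer.sumUp()
    loop.run() // all three actions run in order on the calling thread
    summer.total
  }
}
```

Because only one action runs at a time, neither component needs a lock or a central receive that knows about the other's events.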

Composable asynchronous operations

This idea combines the composable asynchronous when operator with structured parallel programming from Occam.

The problem with the traditional actor model is that the event send is treated as the atomic operation: an event is sent and then received. This is much easier than mutexes. However, an event-driven program usually requires inversion of control and becomes hairy over time. The reason is that after receiving most events we have to restore the context and correlate events with operations.

This is just like the situation with gotos in sequential programming, nicely described by Edsger Dijkstra in his letter "Go To Statement Considered Harmful" (there is a nice analysis of the letter). The solution he argued for is that the flow of program text should correspond to the flow of control.

The framework extends this idea to asynchronous programming. In event-driven programming, a one-way event send is the asynchronous equivalent of the goto operator. So the framework first defines what an asynchronous operation is and then provides a number of operators that combine asynchronous operations. These operators are built upon one-way message sends, just as loops, function calls, and conditional operators in sequential and functional programming are built upon ifs and gotos.

So in the framework, the flow of program text corresponds much more closely to the flow of events in the system than in many other implementations of the Actor model. The framework's operators are also described here in a more abstract form.
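As a rough illustration of the idea (not the framework's real operators), the sketch below defines an asynchronous operation as something started with a continuation, and builds sequencing and a loop on top of a one-way send. The names Async, defer, andThen and loopWhile are hypothetical and chosen only for this example.

```scala
import scala.collection.mutable

object AsyncOps {
  // Hypothetical single-threaded event loop; `send` is the one-way message send.
  private val queue = mutable.Queue.empty[() => Unit]
  def send(action: => Unit): Unit = { queue.enqueue(() => action); () }
  def run(): Unit = while (queue.nonEmpty) queue.dequeue().apply()

  // An asynchronous operation: started with a continuation that receives the result.
  final case class Async[A](start: (A => Unit) => Unit) {
    // Sequential composition: run this operation, then feed its result to `next`.
    def andThen[B](next: A => Async[B]): Async[B] =
      Async[B](done => start(a => next(a).start(done)))
  }

  // Lift a computation into an operation whose body runs on a later turn.
  def defer[A](body: => A): Async[A] =
    Async[A](done => send(done(body)))

  // Asynchronous loop: keep applying `step` while `cond` holds.
  // Each iteration runs on its own turn, so the stack does not grow.
  def loopWhile[A](init: A)(cond: A => Boolean)(step: A => Async[A]): Async[A] =
    if (cond(init)) step(init).andThen(a => loopWhile(a)(cond)(step))
    else defer(init)

  def demo(): (Int, List[String]) = {
    val log = mutable.ListBuffer.empty[String]
    var result = -1
    val counted = loopWhile(0)(_ < 3) { i =>
      defer { log += s"step $i"; i + 1 }
    }
    val op = counted.andThen(n => defer { log += "done"; n * 10 })
    op.start(r => result = r) // nothing runs until the loop turns
    run()
    (result, log.toList)
  }
}
```

The text of demo reads top to bottom in the same order the events fire, even though every step is delivered as a separate message.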

There are also Groovy and Java versions of the framework in Git, but they need some fixing to get them into shape.

Vats

I'm also using vat semantics inspired by E for my RDP model, with great success.

I'm using a temporal vat model, which has a lot of nice properties. In my model:

  • Each vat has a logical time (getTime).
  • Vats may schedule events for future times (atTime, atTPlus).
  • Multiple events can be scheduled within an instant (eventually).
  • Vats are loosely synchronized. Each vat sets a 'maximum drift' from the lead vat.
  • No vat advances past a shared clock time (typically, wall-clock). This allows for soft real-time programming.

This model is designed for scalable, soft real-time programming. The constraints on vat progress give me an implicit real-time scheduler (albeit without hard guarantees), while allowing a little drift between threads (e.g. 10 milliseconds) gives me an acceptable level of parallelism on a multi-core machine.

Further, timing between vats can be deterministic if we introduce explicit delays based on the maximum drift (i.e. send a message of the form 'doSomething `atTime` T', where T is the sum of getTime and getMaxDrift).
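For readers who think in Scala, here is a single-threaded sketch of how such a temporal vat might behave. The operation names follow the list above, but the class TemporalVat, the advanceTo driver and all of the semantics shown are assumptions for illustration, not the actual RDP implementation; real vats would run on separate threads with a shared clock.

```scala
import scala.collection.mutable

// Hypothetical temporal vat; semantics are assumed, not taken from a real library.
final class TemporalVat(val maxDrift: Long) {
  private final case class Event(time: Long, order: Long, action: () => Unit)

  private var now = 0L
  private var seq = 0L // tie-breaker: FIFO order within one instant

  // PriorityQueue is a max-heap, so reverse the ordering to pop the earliest event first.
  private val pending = mutable.PriorityQueue.empty[Event](
    Ordering.by[Event, (Long, Long)](e => (e.time, e.order)).reverse)

  def getTime: Long = now

  def atTime(t: Long)(action: => Unit): Unit = {
    require(t >= now, "cannot schedule in the past")
    pending.enqueue(Event(t, seq, () => action))
    seq += 1
  }
  def atTPlus(dt: Long)(action: => Unit): Unit = atTime(now + dt)(action)
  def eventually(action: => Unit): Unit = atTime(now)(action)

  // Run every event scheduled up to and including `limit` (the shared clock).
  def advanceTo(limit: Long): Unit = {
    while (pending.nonEmpty && pending.head.time <= limit) {
      val e = pending.dequeue()
      now = e.time
      e.action()
    }
    now = math.max(now, limit)
  }
}

object TemporalDemo {
  def demo(): List[String] = {
    val log = mutable.ListBuffer.empty[String]
    val a = new TemporalVat(maxDrift = 10)
    val b = new TemporalVat(maxDrift = 10)
    a.atTime(5) {
      log += s"a@${a.getTime}: sending"
      // Deliver at getTime + maxDrift, so b cannot already be past that instant.
      val t = a.getTime + a.maxDrift
      b.atTime(t) { log += s"b@${b.getTime}: doSomething" }
    }
    a.eventually { log += s"a@${a.getTime}: first" }
    b.advanceTo(10) // b runs ahead of a, but by no more than maxDrift
    a.advanceTo(5)
    b.advanceTo(20)
    log.toList
  }
}
```

In this run vat b is 10 units ahead when a sends at time 5, and the message still lands at a well-defined instant (15) on b's timeline.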

Anyhow, I ended up rejecting the equivalents of your later(vat) and send(vat) operations. The reasons: validation, verification, decoupling. I feel developers should be able to isolate failure and performance concerns to particular vats, or transparently divide or recompose the program into more or fewer vats. This requires the ability to control which code runs in each vat, and means we can't be naming vats directly.

Developers may 'spawn' a new vat with an initial task, though.

To communicate between vats without sharing arbitrary code, I introduced a pair of typeclasses (for Haskell):

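{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}

-- A vat type v can wrap a handler into an opaque, shareable method.
class VatCB v where
  type Method v
  method :: (a -> v ()) -> v (a -> Method v)

-- A method can be called from any context m that has an instance.
class Callable m method where
  call :: method -> m ()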

The hidden 'Method' type (using type families) allows me to share a method between vats while restricting arbitrary code with its associated risks of failure, delay, or divergence.

If developers do want the ability to share arbitrary code, they can achieve that easily enough: method id would allow executing arbitrary code in another vat, and they could share this method with trusted vats. However, the developers of each vat have effective control over this.

[edit: I've recently extended these ideas to support replies and pipelining more uniformly.]

I do not have such real-time

I do not have such real-time constraints or verification requirements on my list. The project is targeted at the IO-bound parts of server-side and client-side applications.

Also, I have read some KeyKOS (and maybe EROS) papers, where I have seen something along the lines of what you are doing; you might be interested in checking them.

later(vat) and send(vat) are not supposed to be used widely in user-level code anyway. However, they are used heavily by the infrastructure layer, and also by the code that creates a vat.

Normally, a component invokes code on another vat through an asynchronous proxy, which sends a message to that vat using send(vat) or later(vat) internally.
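Roughly, such a proxy could look like the sketch below. Vat, Counter and CounterProxy are illustrative names only, and the send shown here is a stand-in for send(vat), not its real signature; the result is reported through a standard scala.concurrent.Promise.

```scala
import scala.collection.mutable
import scala.concurrent.{Future, Promise}
import scala.util.Try

// Hypothetical vat with a one-way `send`; a stand-in for send(vat).
final class Vat {
  private val queue = mutable.Queue.empty[() => Unit]
  def send(action: => Unit): Unit = { queue.enqueue(() => action); () }
  def run(): Unit = while (queue.nonEmpty) queue.dequeue().apply()
}

// A plain component that is only ever touched from inside its vat.
final class Counter {
  private var n = 0
  def increment(): Int = { n += 1; n }
}

// The proxy exposes the same operation, but every call becomes
// a message to the owning vat and returns a promise for the result.
final class CounterProxy(vat: Vat, target: Counter) {
  def increment(): Future[Int] = {
    val p = Promise[Int]()
    vat.send { p.complete(Try(target.increment())) }
    p.future
  }
}

object ProxyDemo {
  def demo(): (Boolean, Option[Try[Int]]) = {
    val vat = new Vat
    val proxy = new CounterProxy(vat, new Counter)
    proxy.increment()
    val second = proxy.increment()
    val pendingBeforeRun = !second.isCompleted // only queued so far
    vat.run()
    (pendingBeforeRun, second.value)
  }
}
```

User-level code sees only CounterProxy; the send call stays inside the infrastructure.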

With promises, some dataflow-based verification could be done, but that is certainly weaker than your soft real-time requirements. If you have links to papers on your work, it would be nice to see them too.

Infrastructure Layer

I use something like those under-the-hood, too, to actually implement VatCB and RDP. But I think it's often worth keeping such infrastructure details out of the public interface.

These operators are

These operators are practically unavoidable when integrating with "legacy" APIs (blocking or not), and there is a lot of that in the rest of Scala. For example, it is sometimes necessary to use send(vat) from non-vat threads (like a timer thread) to notify components in the vat. Also, if a user needs to add some exotic infrastructure component (like work management or integration with Java EE services), these operators will be needed as well.
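The timer case can be sketched as follows. ThreadSafeVat and its methods are hypothetical names, and the mailbox is just a java.util.concurrent.ConcurrentLinkedQueue; the point is only that a foreign thread enqueues work while the handlers still run on the vat's own thread.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

// Hypothetical vat whose mailbox accepts sends from any thread.
final class ThreadSafeVat {
  private val mailbox = new ConcurrentLinkedQueue[() => Unit]()

  // Safe to call from any thread, including non-vat ones.
  def send(action: => Unit): Unit = { mailbox.add(() => action); () }

  // Called only by the vat's own thread: drain and run what has arrived.
  def runPending(): Unit = {
    var next = mailbox.poll()
    while (next != null) {
      next()
      next = mailbox.poll()
    }
  }
}

object LegacyTimerDemo {
  def demo(): (Int, Boolean) = {
    val vat = new ThreadSafeVat
    val vatThread = Thread.currentThread()
    var ticks = 0 // vat-owned state, mutated without locks
    var ranOnVatThread = true

    // Stand-in for a legacy timer thread that only notifies the vat.
    val timer = new Thread(new Runnable {
      def run(): Unit =
        for (_ <- 1 to 3) vat.send {
          ticks += 1
          ranOnVatThread = ranOnVatThread && (Thread.currentThread() eq vatThread)
        }
    })
    timer.start()
    timer.join()
    vat.runPending() // handlers execute here, on the vat thread
    (ticks, ranOnVatThread)
  }
}
```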

Legacy API integration

I do agree with the need to notify from 'legacy' APIs.

VatCB achieves this. Concretely, I provide an instance for Callable IO (Method RDPIO), where RDPIO is my vat type. This allows me to call RDPIO methods from arbitrary IO threads. To tell the truth, legacy integration is the primary reason I developed VatCB, since I intend communication between vats to occur primarily via my RDP model (which is not suitable for normal IO threads).

Yet VatCB remains friendly with respect to safety, modularity, security, and verifiability. I leverage the type-system to constrain the data and operations via method call. Capability security allows me to attenuate or control who has access to which methods. Methods don't reveal whose vat they use, so are quite modular and leave developers free to rearrange the vat structure of the program. I compose methods via Data.Monoid.

By comparison, send(vat) allows anyone with a reference to the vat to execute arbitrary loops or accidentally share references to objects and state maintained by other vats.

While send(vat) does fulfill the role of legacy integration, I feel VatCB is a much superior option. And I'm certain you could, without much effort, achieve similar constraints in AsyncScala.
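As a very rough Scala sketch of what such a constraint might look like (an assumption for illustration, not an existing AsyncScala or VatCB API): the vat hands out opaque Method handles, and a holder can only pass a value through one, never arbitrary code. The names MethodHandles, Method and method are hypothetical.

```scala
import scala.collection.mutable

object MethodHandles {
  // Opaque handle: a holder can only pass a value of type A, never code.
  sealed trait Method[-A] { def call(a: A): Unit }

  final class Vat {
    private val queue = mutable.Queue.empty[() => Unit]

    // Deliberately private: there is no public send(vat)-style entry point.
    private def enqueue(action: () => Unit): Unit = { queue.enqueue(action); () }

    // Whoever owns the Vat reference chooses the handler; keep that reference
    // private and share only the resulting Method handles with other code.
    def method[A](handler: A => Unit): Method[A] = new Method[A] {
      def call(a: A): Unit = enqueue(() => handler(a))
    }

    def run(): Unit = while (queue.nonEmpty) queue.dequeue().apply()
  }

  def demo(): (Boolean, List[String]) = {
    val vat = new Vat
    val log = mutable.ListBuffer.empty[String]
    val greet: Method[String] = vat.method[String](name => log += s"hello $name")
    greet.call("E")
    greet.call("KeyKOS")
    val nothingRanYet = log.isEmpty // calls are only queued at this point
    vat.run()
    (nothingRanYet, log.toList)
  }
}
```

A holder of greet cannot tell which vat sits behind it, which is what leaves the vat structure free to be rearranged.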