Language Support for Fast and Reliable Message-based Communication

Language Support for ... (pdf)

"... using advanced programming language and verification techniques, it is possible to provide and enforce strong system-wide invariants that enable efficient communication and low-overhead software-based process isolation. Furthermore, specifications on communication channels help in detecting programmer mistakes early—namely at compile-time—thereby reducing the difficulty of the message-based programming model."

(And ppt slides)

static verification and async

I was psyched by the title, since message-based comms seem the way to go, as Erlang demonstrates (and the paper nods at Erlang). But I spent no more than three minutes scanning the text, looking for the novel part; I found it in the concluding section 8, whose first two sentences are quoted below for folks who want a fast look:

8. Conclusion To our knowledge, this work is the first to integrate a full-fledged garbage-collected language with statically verified channel contracts and safe zero-copy passing of memory blocks. The main novelty of the work is the enabling of simple and efficient implementation techniques for channel communication due to the system-wide invariants guaranteed by the static verification.

This explains why I didn't stop and read the middle parts more slowly: the memory management for message passing looked fairly conventional. (It's all in the wrists... I mean, all in the heap definition.) The static verification part kills it for me — my interest in types ends with adding minimal optional typing to dynamic languages. I'd have found novelty in the memory management more fun. Maybe it's hard to find new ground there — at least without doing something rather complex, which is its own punishment.

However, the emphasis is well placed on integrating message passing at the language level — as opposed to just in complex add-on libraries. Even so, I had trouble seeing async-based message APIs, which I think are more important than synchronous APIs. Did I just miss them? Or doesn't this paper care about async? Oh, I see; now that I search, "async" appears only under keywords on the first page, and under future work in the conclusion.

didn't stop and read middle parts more slowly

"static verification part kills it for me"
Maybe LtU should split into static and dynamic forums; then we could just look at like-minded stuff :-)

'I find "async" only appears...'

... the OCCAM channels are synchronous, uni-directional, and uniformly typed, whereas the channels described here are asynchronous, bi-directional and governed by channel protocols.

Your final comment might be read as suggesting that "async" isn't done but is "future work"; that wasn't my reading of -

... we are investigating how to build abstractions over channel objects and communication while retaining their asynchronous nature and the static checking of the protocol

- that final sentence from the paper really didn't tell me anything at all.

The Singularity guys are

The Singularity guys are using static checking for two reasons. First, they want different processes to be able to use entirely different garbage collectors. This means they have to do something special with the memory allocated to messages, because that's data two different memory-management regimes will have to interact with. Second, they want to use a micro-kernel architecture, but to make that work they have to radically reduce the cost of messaging. So they want to eliminate the use of hardware memory protection and context switches when they send messages, without giving up memory safety.

In a "dynamic" setting, if you want memory protection for processes, this means you have to either make memory allocation part of the core system (such as with Smalltalk or Lisp machines), or you have to do hardware memory protection (as with Unix) and protect processes in a heavyweight way.

By using static checking, they can guarantee strong invariants about the memory common to processes (i.e., message data), and this lets them have their cake and eat it too.

the Unsafe Code Tax

neelk wrote: "... eliminate the use of hardware memory protection and context switches ..."
Which they've labeled "the Unsafe Code Tax".

(section 4.2.1 of Singularity: Rethinking the Software Stack, April 2007, pdf)

I like the sound of that

neelk: First, they want different processes to be able to use entirely different garbage collectors.

Awesome! No, I mean it: that's a cool idea, and respectable in that it hints at possibly allowing non-lock-in solutions in other processes. (Of course, they might still plan to write all the garbage collectors in other processes themselves, forcing withdrawn respect.) Being able to separate things that needn't be tied together is very helpful in many situations, since substitution is a basic diagnostic tool.

Desire to cut costs is a given, and touching memory is a major cost in recent years when memory latency hasn't decreased as fast as clock cycles increased. So pass-by-reference becomes a bigger payoff, even though sharing causes headaches. However, if shared space is immutable, then you can't tell it's shared, and even code running in the same physical process can't tell sharing happens in different logical processes. (I've talked about this before, but not very clearly.)

Just because they're solving completely predictable problems doesn't mean it's not hard. Memory management is a pain. I was just planning to mark memory immutable, and let processes blow themselves up if they tried to modify immutable state. (This is a kind of "kill them all and let the coder sort them out" design style, putting the burden of eventually meeting constraints on the programmer, rather than forcing the OS to protect itself from predations of coders.)

So pass-by-reference becomes

So pass-by-reference becomes a bigger payoff, even though sharing causes headaches.

See this recent discussion of a PLT Scheme paper which allows memory accountability with full sharing. This means that separate heaps and separate GCs and the exchange heap for IPC as in Singularity aren't necessary to properly account for memory.

Statically verified channel contracts?

In the conclusion:

To our knowledge, this work is the first to integrate a full-fledged garbage-collected language with statically verified channel contracts and safe zero-copy passing of memory blocks.

However, I do not see any details about the statically verified channel contracts. The only thing I saw was:

Send methods never block and only fail if the endpoint is in a state in the conversation where the message cannot be sent.

...which would be a runtime check, not a static check. That statement leads me to believe that the channel contracts do not offer any stronger guarantees than any other IDL compiler-like system.

Furthermore, I do not see a big deal with multiple garbage collectors here; there is one per process, the same as in any other system, plus a manually-allocated global heap for message storage. That the performance of IPC is better than on systems that do have to cross VM boundaries is unsurprising, but that was the point of an earlier paper, if I remember correctly.

The future work also describes:

We are also developing extensions to the channel contracts that enable static prevention of deadlock in the communication of processes.

...which would be neat, but not terribly so since their channel contracts are basic state machines.

the benefit of the doubt

The nice thing about being hopelessly out of my depth when reading this stuff is that when I do not see the details I'm still inclined to give the authors the benefit of the doubt.

Given that the authors repeatedly talk about statically verified channel contracts, we can be confident that they think that's what they have implemented - which just leaves the minor details of whether we agree about what that means, and if we do agree then how did they do it ;-)

I guess (and I mean guess) that "session types" is a hint:

"Channel communication is governed by statically verified channel contracts that describe messages, message argument types, and valid message interaction sequences as finite state machines similar to session types [17, 21]." 1 INTRODUCTION

and hints about applying static verification of explicit resource use

"The rules described in this section for handling blocks in the exchange heap allow a compiler to statically verify that ... 3) send and receive operations on channels are never applied in the wrong protocol state." 3 RESOURCE VERIFICATION

and

"... the static tracking ensures the following system-wide invariants:
* The sequence of messages observed on channels correspond to the channel contract." 3.2.1 Discussion

and as a wild-ass-guess, maybe they are building on previous work like "The Fugue protocol checker"?

static prevention of deadlock

Here's a pre-publication paper Static Deadlock Prevention in Dynamically Configured Communication Networks (pdf)

"The technique is based on statically approximating all potential wait cycles that could arise at runtime. The potential cycle information is computed from the local perspective of each process, leading to a modular analysis.
We prove the soundness of the analysis (namely deadlock freedom) on a simple but realistic language that captures the essence of such communication networks."