Parallel/Distributed

Tim Bray and Erlang

Erlang is getting a lot of attention these days (LtU was there first, of course: we had in-depth discussions of Erlang several years ago). One recent discussion is around Tim Bray's experiments with Erlang. Steve Vinoski provides timing results showing the effect of utilizing Erlang's concurrency features on Tim's challenge.
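For readers who haven't followed the thread: Bray's challenge involves scanning a large web-server log in parallel. A minimal sketch of that kind of task, in Python rather than Erlang, and with an assumed pattern and chunking scheme (this is not Bray's or Vinoski's code):

```python
# Illustrative sketch: count occurrences of a pattern across chunks of
# a log, scanning the chunks in parallel. PATTERN and the chunking
# strategy are assumptions, not the challenge's exact specification.
from concurrent.futures import ThreadPoolExecutor
import re

PATTERN = re.compile(r"GET /ongoing/When/\S+")  # assumed pattern

def count_matches(lines):
    """Count pattern hits in one chunk of log lines."""
    return sum(1 for line in lines if PATTERN.search(line))

def wide_finder(lines, workers=4):
    """Split the log into chunks, scan them in parallel, sum the counts."""
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(workers) as pool:
        return sum(pool.map(count_matches, chunks))
```

The interesting part of the Erlang discussion is, of course, how naturally (or not) this decomposition falls out of the language's process model.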

Fair cooperative multithreading, or: Typing termination in a higher-order concurrent imperative language

Fair cooperative multithreading, or: Typing termination in a higher-order concurrent imperative language. September 2007. A preliminary version (without proofs, recursion and region-scoping) appeared in the proceedings of CONCUR'07, LNCS 4703 (2007), 272-286.

We propose a new operational model for shared variable concurrency, in
the context of a concurrent, higher-order imperative language à la ML.
In our model the scheduling of threads is cooperative, and a
non-terminating process suspends itself on each recursive call. A
property to ensure in such a model is fairness, that is, any thread
should yield the scheduler after some finite computation. To this end,
we follow and adapt the classical method for proving termination in
typed formalisms, namely the realizability technique. There is a
specific difficulty with higher-order state, which is that one cannot
define a realizability interpretation simply by induction on types,
because applying a function may have side-effects at types not smaller
than the type of the function. Moreover, such higher-order side-effects
may give rise to computations that diverge without resorting to explicit
recursion. We overcome these difficulties by introducing a type and
effect system for our language that enforces a stratification of the
memory. The stratification prevents the circularities in the memory that
may cause divergence, and allows us to define a realizability
interpretation of the types and effects, which we then use to prove the
intended termination property. Our realizability interpretation also
copes with dynamic thread creation.
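The cooperative discipline the abstract describes can be sketched in a few lines: each thread suspends itself after a bounded amount of work, and a round-robin scheduler guarantees that every live thread runs before any thread runs twice, which is exactly the fairness property at stake. A minimal sketch in Python generators rather than the paper's ML-like language; all names here are illustrative assumptions:

```python
# Cooperative scheduling sketch: a thread yields after each unit of
# work (the analogue of suspending on each recursive call), so the
# round-robin scheduler below is fair by construction.
from collections import deque

def worker(name, steps, log):
    """A thread that does `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append((name, i))
        yield  # suspend and return control to the scheduler

def run_fair(threads):
    """Round-robin: every live thread runs before any thread runs twice."""
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)
            ready.append(t)  # thread yielded: requeue it at the back
        except StopIteration:
            pass  # thread terminated
```

The paper's contribution is precisely the type-and-effect system ensuring that threads in the source language do yield after finite computation, something Python cannot check statically.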

Slides are available, not just the paper, and provide a nice introduction to approaches for proving termination, which may be useful in their own right to those not familiar with the topic.

Efficient, Correct Simulation of Biological Processes in the Stochastic Pi-calculus

Efficient, Correct Simulation of Biological Processes in the Stochastic Pi-calculus. Andrew Phillips, Luca Cardelli. September 2007.

In recent years, there has been considerable research on designing programming languages for complex parallel computer systems. Interestingly, some of this research is also applicable to biological systems, which are typically highly complex and massively parallel. In particular, a mathematical programming language known as the stochastic Pi-calculus has recently been used to model and simulate a range of biological systems...

...the prototype simulator presented in this paper will form the basis of the next release of the Stochastic Pi Machine.

SPiM, The Stochastic Pi Machine, is here, where you'll find tutorials and related information.

A nice introduction to the Gillespie algorithm can be found here.
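For the impatient, the core of Gillespie's direct method fits in a few lines: draw an exponentially distributed waiting time from the total reaction propensity, then fire a reaction. A minimal sketch for a single irreversible reaction A + B -> C; the species, rate constant, and interface are illustrative assumptions, not SPiM's model format:

```python
# Gillespie direct-method sketch for one reaction A + B -> C with
# rate constant k. With a single reaction there is no choice of which
# reaction fires, only when; SPiM handles the general multi-reaction case.
import random

def gillespie(a, b, c, k, t_end, seed=0):
    """Simulate A + B -> C until t_end; return final species counts."""
    rng = random.Random(seed)
    t = 0.0
    while a > 0 and b > 0:
        propensity = k * a * b              # rate of the single reaction
        t += rng.expovariate(propensity)    # exponential waiting time
        if t > t_end:
            break
        a, b, c = a - 1, b - 1, c + 1       # fire the reaction
    return a, b, c
```

The simulator in the paper is about doing this correctly and efficiently when the set of reactions (processes) changes dynamically, which is where the Pi-calculus structure earns its keep.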

Ralph Johnson: Erlang, the next Java

A nice blog post about Erlang. Nothing new here for those following the Erlang story, but a nice summary of several important themes nonetheless. Some choice quotes:

The thing that bugs me about [Armstrong's] book (and about his talks) is that he makes more fuss than he should about the functional language aspect of Erlang and not enough about the OO aspect. In fact, he denies that it is OO.

Processes in Erlang are objects. At the beginning of my course on OO design, I explain three views of OO programming. The Scandinavian view is that an OO system is one whose creators realize that programming is modeling. The mystical view is that an OO system is one that is built out of objects that communicate by sending messages to each other, and computation is the messages flying from object to object. The software engineering view is that an OO system is one that supports data abstraction, polymorphism by late-binding of function calls, and inheritance. Erlang is a perfect example of the actor model, which is an example of the mystical view.

Moreover, given several kinds of processes that have a common protocol and that share some things in common, it is easy to factor out the commonality into a function that they can both call. This is similar to class inheritance. So, you could even say that Erlang supports inheritance...
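Johnson's "factoring out the commonality" point is concrete: in Erlang it is the idiom of a shared receive loop parameterized by kind-specific behaviour (the pattern gen_server institutionalizes). A minimal sketch in Python rather than Erlang, with all names being illustrative assumptions:

```python
# Two kinds of "process" share one message loop, the way Erlang
# processes factor a common protocol into a shared function. The
# shared loop plays the role of the inherited superclass behaviour.
import queue

def server_loop(inbox, state, handle):
    """Shared receive loop: the factored-out common protocol."""
    while True:
        msg = inbox.get()
        if msg == "stop":
            return state
        state = handle(state, msg)  # kind-specific behaviour

def counter(state, msg):        # one "subclass": counts messages
    return state + 1

def accumulator(state, msg):    # another: sums numeric payloads
    return state + msg
```

In Erlang the loop would be a tail-recursive function and the inbox the process mailbox, but the factoring is the same.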

Resources, Concurrency and Local Reasoning

Resources, Concurrency and Local Reasoning, Peter O'Hearn, 2007.

Resource has always been a central concern in concurrent programming. Often, a number of processes share access to system resources such as memory, processor time, or network bandwidth, and correct resource usage is essential for the overall working of a system. In the 1960s and 1970s Dijkstra, Hoare and Brinch Hansen attacked the problem of resource control in their basic works on concurrent programming. In addition to the use of synchronization mechanisms to provide protection from inconsistent use, they stressed the importance of resource separation as a means of controlling the complexity of process interactions and reducing the possibility of time-dependent errors. This paper revisits their ideas using the formalism of separation logic.

I have to admit that in posting this paper, I was at least slightly motivated by the imp of the perverse -- it's pretty much a commonplace among expert programmers that reasoning about programs that use shared memory, explicit mutual exclusion, and concurrency is very difficult. So the urge to post a paper showing one approach for doing it in a clean (and imo quite beautiful) way was irresistible.
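The separation intuition the paper formalizes has a very down-to-earth reading: threads that own disjoint pieces of memory need no synchronization at all, and their effects compose. A minimal sketch of that special case, with an assumed slicing scheme (concurrent separation logic, of course, also handles the harder case of ownership transferred through locks):

```python
# Resource separation sketch: each thread exclusively owns a disjoint
# slice of the buffer, so no locks are needed and the result is
# deterministic. Disjointness is what separation logic's * captures.
import threading

def fill(buf, lo, hi, value):
    """This thread owns buf[lo:hi] exclusively; no lock required."""
    for i in range(lo, hi):
        buf[i] = value

def parallel_fill(n, workers=4):
    buf = [None] * n
    step = (n + workers - 1) // workers
    threads = [
        threading.Thread(target=fill,
                         args=(buf, w * step, min(n, (w + 1) * step), w))
        for w in range(workers)
    ]
    for t in threads: t.start()
    for t in threads: t.join()
    return buf
```

The beauty of the logic is that this kind of "obviously safe because disjoint" reasoning becomes a formal proof rule rather than folklore.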

The Karmasphere DP language

A friend of mine's asking for commentary on their project: The Karmasphere DP language

The Karmasphere DP language is a high-performance non-blocking
parallel language for performing data processing. It is designed
to give the user a high degree of control over the usage of system
resources, for example, how many CPU cores or how much disk I/O time
to use, without requiring the software developer to explicitly consider
these issues in code.

It was originally intended for collecting attributes of URLs and
domain names to be used in an anti-spam system, although it has since
developed into a full parallel programming language with many general
purpose operators.

It can take full advantage of multiprocessor (SMP or NUMA) systems,
and may be scaled sideways - since the interpreter and environment
are stateless, an entire cluster of machines may run the interpreter
in parallel without any requirement for synchronization.

In a quick conversation on IRC, the following was said:

Well, most people doing this kind of thing choose to play around with explicit continuations in the user language - which is totally technically correct, but from the user's point of view, impractical. Any language which requires a user to understand continuations or other theoretical concepts is the wrong answer. This language is designed to be simple to comprehend for the user who doesn't know anything about programming, even less parallelism.

Personally I find the relationship with GraphViz interesting.

edit: A demo is now available

Joe Armstrong DDJ interview

A short interview with Joe Armstrong about Erlang (naturally):

Ericsson has always used parallel hardware in its switching products -- right from the AXE in the mid 1970s. There is a "school" of parallel programming inside Ericsson. Erlang is the third language in a progression of languages: PLEX -> Eri-Pascal -> Erlang.

Multiscale Scheduling, Integrating Competitive and Cooperative Parallelism in Theory and in Practice

Multiscale Scheduling, Integrating Competitive and Cooperative Parallelism in Theory and in Practice.
G. E. Blelloch, L. Blum, M. Harchol-Balter, and R. Harper. February, 2007.

Our view is that the computing systems of the future will exhibit parallelism at multiple levels of granularity, and that these resources must be shared among many users simultaneously. The combination of these two aspects, which we call multiscale parallelism, poses significant challenges for implementing next-generation systems...

We believe that the theoretical issues of multiscale scheduling cannot be properly addressed without carefully considering how these issues will work with particular applications and how they coordinate with the programming languages used to express the parallelism.

This proposed long-term research project (which as far as I can tell was not mentioned here before) is interesting on many levels. From a programming languages perspective the main idea seems to be extending the NESL language, which is based on data parallelism, with such concepts as speculative parallelism that mesh well with multilevel scheduling.
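For those who haven't met NESL: its core construct is an apply-to-each over sequences, which may themselves contain sequences, giving nested data parallelism. A minimal sketch of the idea in Python (the names are assumptions; NESL's own syntax is `{f(x) : x in seq}`, and its implementation flattens the nesting, which this sketch does not):

```python
# NESL-style nested data parallelism, sketched sequentially: an
# apply-to-each whose body is itself an apply-to-each. Each element
# is independent work that a parallel implementation could schedule.
def apply_to_each(f, seq):
    """The analogue of NESL's {f(x) : x in seq}."""
    return [f(x) for x in seq]

def nested_sums(rows):
    # Outer apply-to-each over rows; the body (sum) is itself a
    # reduction over an inner sequence: nesting is the point.
    return apply_to_each(sum, rows)
```

The project's twist, as the proposal describes it, is layering speculative parallelism and multilevel scheduling on top of this purely data-parallel core.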

Evaluating High-Level Distributed Language Constructs

I saw this mentioned by one of the authors on the erlang-questions mailing list:

The paper investigates the impact of high level distributed programming language constructs on the engineering of realistic software components. Based on reengineering two non-trivial telecoms components, we compare two high-level distributed functional languages, Erlang and GdH, with conventional distributed technologies C++/CORBA and C++/UDP.

GdH = Glasgow distributed Haskell.

XML feed