Parallel/Distributed
A located lambda calculus. Ezra Cooper and Philip Wadler. Submitted to ICFP 2008.
Several recent language designs have offered a unified language for programming a distributed system; we call these "location-aware" languages. These languages provide constructs that allow the programmer to control the location (the choice of host, for example) where a piece of code should run, which can be useful for security or performance reasons. On the other hand, a central mantra of web engineering insists that web servers should be "stateless": that no "session state" should be maintained on behalf of individual clients---that is, no state that pertains to the particular point of the interaction at which a client program resides. Thus far, most implementations of unified location-aware languages have ignored this precept, usually keeping a process for each client running on the server, or otherwise storing state information in memory. We show how to implement a location-aware language on top of the stateless-server model.
This paper is technical, and I assume most LtU members will mainly read sections 1, 5 & 6. Figure 5 gives the definition of the located lambda calculus.
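For readers who want the flavor without the formalism: a standard trick for running stateful-looking interactions over a stateless server is to defunctionalize the continuation and round-trip it through the client. Here is a minimal, hypothetical sketch in Haskell (the paper's calculus and terminology differ; Show/Read stand in for real serialization):

    -- Hypothetical sketch: serialize a defunctionalized continuation and
    -- round-trip it through the client, so no session state lives in
    -- server memory. Show/Read stand in for real serialization.
    data Cont = AskedAge { userName :: String }
      deriving (Show, Read)

    -- First turn: the server replies with a question plus the
    -- continuation the client must echo back on its next request.
    start :: String -> (String, String)
    start n = ("How old are you?", show (AskedAge n))

    -- Second turn: a stateless handler reconstitutes the continuation
    -- from the request itself.
    handle :: String -> String -> String
    handle contStr answer =
      case read contStr :: Cont of
        AskedAge n -> "Hello " ++ n ++ ", you are " ++ answer ++ "."

    main :: IO ()
    main = do
      let (question, k) = start "Ada"   -- server turn 1
      putStrLn question
      putStrLn (handle k "36")          -- server turn 2, nothing retained

The point is that handle is a pure function of the request: the server retains nothing between the two turns of the conversation.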
A Modular Language for Concurrent Programming, September 2006, Technical Report by Peter Grogono and Brian Shearing.
How will programmers respond to the long-promised concurrency revolution, which now appears both inevitable and imminent? One common answer is "by adding threads to objects". This paper presents an alternative answer that we believe will reduce rather than add complexity to the software of the future. Building on the ideas of an earlier generation, we propose a modern programming language based on message passing. A module cannot invoke a method in another module, but can only send data to it. Modules may be constructed from other modules, thus permitting processes within processes. Our goal is to provide the flexibility and expressiveness of concurrent programming while limiting, as much as possible, the complexity caused by nondeterminism.
The principal innovations reported in the paper derive from bringing together ideas -- some well known, but others almost forgotten -- found in the historical software literature, and combining these ideas to solve problems facing modern software developers. In addition, at least one idea reported here appears to be novel, namely the introduction of an interface hierarchy based not on data elements or methods, but on path expressions, on the actual flow of control within a module. It is more natural to classify components of a process-oriented system by control flow rather than data content.
Another novel feature is the integration of unit tests into the source of each component, thus reducing the possibilities for testing to get out of step with development.
The project home page is here.
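The core restriction -- modules exchange data rather than invoke each other's methods -- is easy to demonstrate with ordinary channels. A rough Haskell sketch, where an MVar stands in for the language's typed ports (which the report specifies precisely):

    import Control.Concurrent

    -- Module A cannot call a method of module B; it can only send data.
    -- An MVar models a one-place port (a hypothetical stand-in).
    producer :: MVar Int -> IO ()
    producer port = mapM_ (putMVar port) [1 .. 3]

    consumer :: MVar Int -> IO ()
    consumer port = mapM_ (const (takeMVar port >>= print)) [1 .. 3 :: Int]

    main :: IO ()
    main = do
      port <- newEmptyMVar
      _ <- forkIO (producer port)   -- the two "modules" run concurrently
      consumer port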
WaveScope is a system for developing distributed, high-rate applications that need to process streams of data from various sources (e.g., sensors) using a combination of signal processing and database (event stream processing) operations. The execution environment for these applications ranges from embedded sensor nodes to multicore/multiprocessor servers.
WaveScript is the programming language used to develop WaveScope applications. It is a high-level, functional, stream-processing language that aims to deliver uncompromising performance. WaveScript programs execute in parallel on multiple cores, or distributed across a network. Its compiler uses aggressive partial evaluation techniques to remove abstractions and reduce the source program to a graph of stream operators.
This came up in the discussion group and since it is cool (both the project and the language), and the other editors are mostly MIA, I thought I'd bring it to the attention of those who only follow the home page.
To get a taste of the language click here.
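For those who won't click through, here is a rough approximation of the operator-graph style in plain Haskell -- hypothetical, since WaveScript has its own syntax, and its compiler fuses such pipelines rather than interpreting lists:

    -- Streams as (possibly infinite) lists; operators as functions on them.
    type Stream a = [a]

    -- Chop a stream into fixed-size windows.
    window :: Int -> Stream a -> Stream [a]
    window n xs = case splitAt n xs of
      (w, rest) | length w == n -> w : window n rest
                | otherwise     -> []

    -- A toy signal-processing operator: root-mean-square of each window.
    rms :: Stream [Double] -> Stream Double
    rms = map (\w -> sqrt (sum (map (^ 2) w) / fromIntegral (length w)))

    main :: IO ()
    main = print (take 3 (rms (window 4 (cycle [1, 2, 3, 4]))))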
Computation Orchestration: A Basis for Wide-Area Computing, Jayadev Misra and William Cook. JSSM 2006.
The widespread deployment of networked applications and adoption of the internet have fostered an environment in which many distributed services are available. There is great demand to automate business processes and workflows among organizations and individuals. Solutions to such problems require orchestration of concurrent and distributed services in the face of arbitrary delays and failures of components and communication.
We propose a novel approach, called Orc for orchestration, that supports a structured model of concurrent and distributed programming. This model assumes that basic services, like sequential computation and data manipulation, are implemented by primitive sites. Orc provides constructs to orchestrate the concurrent invocation of sites to achieve a goal -- while managing time-outs, priorities, and failure of sites or communication.
The idea of orchestration languages is one of the good ideas to come out of the web services world. Orc is an elegant little process-calculus-inspired programming language that illustrates and embodies the key ideas, and I'd recommend studying it to anyone who has even a passing inclination to design languages or libraries for "mashup" style programming.
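Two of Orc's combinators can be approximated in a few lines using the Haskell async library -- an approximation only, since Orc's semantics (publications, site calls, timing) are richer than plain IO actions:

    import Control.Concurrent (threadDelay)
    import Control.Concurrent.Async (concurrently, race)

    -- Orc's  f | g  runs sites in parallel; concurrently waits for both.
    par :: IO a -> IO b -> IO (a, b)
    par = concurrently

    -- Orc's pruning combinator takes the first result and cancels the
    -- loser, which race also does.
    prune :: IO a -> IO a -> IO a
    prune f g = either id id <$> race f g

    main :: IO ()
    main = do
      winner <- prune (pure "fast site")
                      (threadDelay 100000 >> pure "slow site")
      putStrLn winner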
A short interview with Brian Grant (of PeakStream and currently Google). From SD Times, Nov. 15, 2007.
SD Times: Is concurrency too difficult for developers accustomed to linear programming to grasp?
Brian Grant: It is challenging, but I don’t think it’s too challenging. I do think certain languages, especially C and C++, make it very challenging to write robust big multithreaded systems. Basically, they have features that are hostile to concurrency.
Does your experience in compilers give you a different perspective than the average developer?
[...]
Higher-level languages favor ease of correctness over ease of performance. For example, in functional languages such as Haskell, the code you write does not have any side effects. The model is that they don’t modify existing data; they generate new data. The advantage is that it’s easier for the compiler to reason about what different components of the code can do.
[...]
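Grant's point about side effects is easy to see in miniature. In the toy Haskell fragment below, nothing is mutated in place, so the compiler (or a reader) knows the two derived lists are independent:

    -- Nothing here mutates: doubleAll returns a new list, so xs is
    -- unchanged and ys and zs have no ordering constraint between them.
    doubleAll :: [Int] -> [Int]
    doubleAll = map (* 2)

    main :: IO ()
    main = do
      let xs = [1, 2, 3]
          ys = doubleAll xs   -- new data, not a modification of xs
          zs = map (+ 1) xs   -- independent of ys
      print (xs, ys, zs)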
Carnap is a general purpose programming language for the next generation of many-core devices, many many-core systems and their applications. It introduces a process oriented programming model that allows programmers to separate the concerns: Carnap programs consist of data structures and the concurrent processes that act upon them.
§2 "The primitive process of a Carnap program is called an action. An action determines a local or shared state. Actions are assembled by construction to form the component processes of a program. Programs consist of concurrent processes that construct and interact via logically shared data structures and resources called Contexts.
§3 In this way the application programmer is able to separate concerns, reasoning separately about the two primary aspects of Carnap programs: potentially large scale data structures and the concurrent processes that act upon them.
§4 Contexts are named, type associative and statically typed.
Carnap was mentioned here before, but I think that the website provides more information now, specifically there are now some example programs to look at.
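Since Carnap's implementation isn't widely available, here is a loose Haskell analogue of "concurrent processes acting on a shared context" -- an MVar-protected list standing in for Carnap's name- and type-associative contexts:

    import Control.Concurrent

    main :: IO ()
    main = do
      ctx  <- newMVar ([] :: [Int])   -- the shared "context"
      done <- newEmptyMVar
      -- an "action": a process step that updates the shared context
      let process i = do modifyMVar_ ctx (pure . (i :))
                         putMVar done ()
      mapM_ (forkIO . process) [1 .. 4]             -- four concurrent processes
      mapM_ (const (takeMVar done)) [1 .. 4 :: Int] -- wait for all four
      readMVar ctx >>= print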
binpac: A yacc for Writing Application Protocol Parsers.
R. Pang, V. Paxson, R. Sommer, and L. Peterson. ACM Internet Measurement Conference. October 2006.
A key step in the semantic analysis of network traffic is to parse the traffic stream according to the high-level protocols it contains. This process transforms raw bytes into structured, typed, and semantically meaningful data fields that provide a high-level representation of the traffic. However, constructing protocol parsers by hand is a tedious and error-prone affair due to the complexity and sheer number of application protocols. This paper presents binpac, a declarative language and compiler designed to simplify the task of constructing robust and efficient semantic analyzers for complex network protocols. We discuss the design of the binpac language and a range of issues in generating efficient parsers from high-level specifications. We have used binpac to build several protocol parsers for the "Bro" network intrusion detection system, replacing some of its existing analyzers (handcrafted in C++), and supplementing its operation with analyzers for new protocols. We can then use Bro's powerful scripting language to express application-level analysis of network traffic in high-level terms that are both concise and expressive. binpac is now part of the open-source Bro distribution.
Binpac nicely abstracts away issues such as large numbers of concurrent, asynchronous parsing processes and protocol specifics (such as HTTP's chunked encoding). A parser for a large part of HTTP is presented in the paper and fits on half a page. The authors have also written parsers for CIFS/SMB, DCE/RPC, DNS, NCP, and Sun/RPC.
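Haskell readers may find parser combinators a useful mental model for what a binpac record type compiles into (binpac itself emits C++; the toy length-prefixed field below is purely illustrative):

    -- Parse one length-prefixed field: a single digit n, then n bytes.
    -- (Illustrative only; real binpac specs are type declarations that
    -- compile to C++ parsers.)
    parseField :: String -> Maybe (String, String)
    parseField (c : rest)
      | c >= '0' && c <= '9', length field == len = Just (field, rest')
      where
        len            = fromEnum c - fromEnum '0'
        (field, rest') = splitAt len rest
    parseField _ = Nothing

    main :: IO ()
    main = print (parseField "5hello!")   -- Just ("hello","!")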
F. Vahid. It's Time to Stop Calling Circuits "Hardware". IEEE Computer Magazine, September 2007.
The advent of field-programmable gate arrays requires that we stop calling circuits "hardware" and, more generally, that we broaden our concept of what constitutes "software." ...
Developing modern embedded software capable of executing on multiprocessor and FPGA platforms requires expertise not just in temporally oriented modeling (W comes after X) like writing sequential code but also in spatially oriented modeling (Y connects with Z) like creating circuits.
An interesting take on where programming should be heading in the future -- and consequently, where programming languages should also be heading. This article is somewhat related to the recent discussion here on LtU about FPGA CPUs. As the excerpt above illustrates, Vahid draws a distinction between what he calls "temporally-oriented" computing, which focuses on sequence, and "spatially-oriented" computing, which focuses on connectivity of components. His basic argument is that traditional programming languages (and traditional programming education) focus on temporally-oriented computing, but that the growing use of FPGAs as an integral part of many systems (particularly embedded systems) necessitates a greater emphasis on programming in a spatially-oriented mode. We don't tend to talk too much about "hardware description" languages like VHDL and Verilog here on LtU, but perhaps they are the answer (or at least part of the answer) to Ehud's recent question about which languages we should be discussing to "stay ahead of the curve".
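The temporal/spatial distinction can be made concrete even in a conventional language. Both functions below compute the same sum; the first sequences steps, the second declares how components connect -- a Haskell stand-in for what an HDL expresses natively:

    -- Temporally oriented: an explicit sequence of steps.
    temporal :: [Int] -> Int
    temporal = go 0
      where
        go acc []       = acc
        go acc (x : xs) = go (acc + x) xs   -- "W comes after X"

    -- Spatially oriented: a fold wires a chain of adders together
    -- ("Y connects with Z"); an HDL would lay this out as circuitry.
    spatial :: [Int] -> Int
    spatial = foldr (+) 0

    main :: IO ()
    main = print (temporal [1 .. 4], spatial [1 .. 4])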
Derivation and Evaluation of Concurrent Collectors, Martin T. Vechev, David F. Bacon, Perry Cheng, and David Grove. ECOOP 2005.
There are many algorithms for concurrent garbage collection, but they are complex to describe, verify, and implement. This has resulted in a poor understanding of the relationships between the algorithms, and has precluded systematic study and comparative evaluation. We present a single high-level, abstract concurrent garbage collection algorithm, and show how existing snapshot and incremental update collectors can be derived from the abstract algorithm by reducing precision. We also derive a new hybrid algorithm that reduces floating garbage while terminating quickly. We have implemented a concurrent collector framework and the resulting algorithms in IBM's J9 Java virtual machine product and compared their performance in terms of space, time, and incrementality. The results show that incremental update algorithms sometimes reduce memory requirements (on 3 of 5 benchmarks) but they also sometimes take longer due to recomputation in the termination phase (on 4 of 5 benchmarks). Our new hybrid algorithm has memory requirements similar to the incremental update collectors while avoiding recomputation in the termination phase.
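The snapshot/incremental-update distinction the paper starts from comes down to which pointer the write barrier shades. A schematic Haskell sketch (names hypothetical; real barriers live inside the VM and interact with a concurrent marker):

    import Data.IORef

    data Color = White | Grey | Black deriving Show
    data Obj   = Obj { color :: IORef Color }

    -- Shading: make a white object grey so the marker will revisit it.
    shade :: Obj -> IO ()
    shade o = modifyIORef (color o) (\c -> case c of White -> Grey
                                                     other -> other)

    -- Snapshot-at-the-beginning: preserve what was reachable at the start
    -- by shading the OLD target of an overwritten field.
    snapshotWrite :: IORef Obj -> Obj -> IO ()
    snapshotWrite slot new = do
      old <- readIORef slot
      shade old
      writeIORef slot new

    -- Incremental update: track NEW reachability by shading the stored
    -- object instead.
    incUpdateWrite :: IORef Obj -> Obj -> IO ()
    incUpdateWrite slot new = do
      shade new
      writeIORef slot new

    main :: IO ()
    main = do
      a    <- Obj <$> newIORef White
      b    <- Obj <$> newIORef White
      slot <- newIORef a
      snapshotWrite slot b            -- shades a, the old target
      readIORef (color a) >>= print   -- Grey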
Status Report: The Manticore Project. Along the lines of Concurrent ML comes a new language design that is aimed at the parallelism of multicore CPUs.
The Manticore project is an effort to design and implement a new functional language for parallel programming. Unlike many earlier parallel languages, Manticore is a heterogeneous language that supports parallelism at multiple levels. Specifically, we combine CML-style explicit concurrency with fine-grain, implicitly threaded, parallel constructs. We have been working on an implementation of Manticore for the past six months; this paper gives an overview of our design and a report on the status of the implementation effort.
Interesting material, both for the approach and for the implementation, which is based on an ML-style language (sans mutable variables). But the underlying assumption that language design offers the best path to solving parallelism is probably the key appeal for LtU:
The problem posed by such processors is how to effectively exploit the different forms of parallelism provided by the hardware. We believe that mechanisms that are well integrated into a programming language are the best hope for achieving parallelism across a wide range of applications, which is why we are focusing on language design and implementation.
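The "CML-style explicit concurrency" half is the familiar part: spawn a thread, communicate over a channel. A minimal Haskell approximation (an MVar is an asynchronous one-place buffer, whereas real CML channels are synchronous and its events compose; the fine-grain implicit parallelism is the novel half and isn't shown here):

    import Control.Concurrent

    main :: IO ()
    main = do
      ch <- newEmptyMVar
      _  <- forkIO (putMVar ch (21 * 2 :: Int))   -- explicit thread + send
      takeMVar ch >>= print                       -- blocking receive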
And as long as we're on the subject, I see that Reppy's book on Concurrent ML is now in paperback form (moving it more into my price range).