Parallel/Distributed

Intel Concurrent Collections for Haskell

Intel Concurrent Collections...provides a mechanism for constructing [programs] that will execute in parallel while allowing the application developer to ignore issues of parallelism such as low-level threading constructs or the scheduling and distribution of computations. The model allows the programmer to specify high-level computational steps including inputs and outputs without imposing unnecessary ordering on their execution... Data is either local to a computational step or it is explicitly produced and consumed by them. An application in this programming model supports multiple styles of parallelism (e.g., data, task, pipeline parallel). While the interface between the computational steps and the runtime system remains unchanged, a wide range of runtime systems may target different architectures (e.g., shared memory, distributed) or support different scheduling methodologies (e.g., static or dynamic).

In short, CnC purports to be "a graph-based data-flow-esque deterministic parallel programming model". An open-source Haskell edition of the software was recently released on Hackage.

A series of blog posts describes the implementation: #1, #2, #3, #4, #5 (the last post links to a draft paper).

Personally, I would have preferred a more concise and down-to-earth description of the whole thing. If you have experiences to share, please do.
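In that down-to-earth spirit, the step/item-collection idea can be rendered as a toy in a few lines of Java. To be clear, this is an illustration of the model only (not Intel's API and not the Haskell library): a step is prescribed by a tag, communicates only through single-assignment item collections, and so can be scheduled in any order relative to its producers.

    import java.util.concurrent.*;

    // A single-assignment item collection: put() writes a key at most once
    // (later writes are ignored); get() blocks until the key is produced.
    class ItemCol<K, V> {
        private final ConcurrentHashMap<K, CompletableFuture<V>> cells = new ConcurrentHashMap<>();
        private CompletableFuture<V> cell(K k) {
            return cells.computeIfAbsent(k, key -> new CompletableFuture<>());
        }
        void put(K k, V v) { cell(k).complete(v); }
        V get(K k) { return cell(k).join(); }
    }

    public class ToyCnC {
        public static void main(String[] args) {
            ExecutorService runtime = Executors.newFixedThreadPool(4);
            ItemCol<String, String> items = new ItemCol<>();

            // A "tag" arriving prescribes a step instance; the step's only
            // communication is through the item collection.
            runtime.submit(() -> items.put("result", items.get("left") + items.get("right")));

            // The environment produces the step's inputs; whether this happens
            // before or after the step is scheduled is not observable.
            items.put("left", "Hello ");
            items.put("right", "world!");

            System.out.println(items.get("result"));
            runtime.shutdown();
        }
    }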

The Resurgence of Parallelism

Peter J. Denning and Jack B. Dennis, The Resurgence of Parallelism, Communications of the ACM, Vol. 53, No. 6, pages 30-32, June 2010. doi:10.1145/1743546.1743560

"Multi-core chips are a new paradigm!" "We are entering the age of parallelism!" These are today's faddish rallying cries for new lines of research and commercial development. ... The parallel architecture research of the 1960s and 1970s solved many problems that are being encountered today. Our objective in this column is to recall the most important of these results and urge their resurrection.

A brief but timely reminder that we should avoid reinventing the wheel. Denning and Dennis give a nice capsule summary of the history of parallel computing research and highlight some of the key ideas that came out of it. This isn't a technically deep article, but it gives a quick overview of the field and tries to identify which problems are genuine research challenges and which have solutions that have seemingly been forgotten.

Brian's functional brain

A rather fun and instructive series of posts about implementing a cellular automaton in idiomatic Clojure, and making it run fast (and in parallel).

Be sure to read the second part, and the errata.
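The automaton itself is tiny. Here is a sketch of one update step in Java rather than the posts' Clojure, parallelizing over rows with a parallel stream; the rules of Brian's Brain: an off cell with exactly two live neighbours switches on, a live cell starts dying, a dying cell switches off.

    import java.util.stream.IntStream;

    // Brian's Brain on a toroidal grid: 0 = off, 1 = on, 2 = dying.
    public class BriansBrain {
        static int[][] step(int[][] g) {
            int h = g.length, w = g[0].length;
            int[][] next = new int[h][w];
            // Rows are independent, so they can be computed in parallel.
            IntStream.range(0, h).parallel().forEach(y -> {
                for (int x = 0; x < w; x++) {
                    int on = 0;  // count "on" neighbours, wrapping at the edges
                    for (int dy = -1; dy <= 1; dy++)
                        for (int dx = -1; dx <= 1; dx++)
                            if ((dy != 0 || dx != 0)
                                    && g[(y + dy + h) % h][(x + dx + w) % w] == 1)
                                on++;
                    next[y][x] = g[y][x] == 1 ? 2          // on -> dying
                               : g[y][x] == 2 ? 0          // dying -> off
                               : on == 2 ? 1 : 0;          // off -> on iff 2 on
                }
            });
            return next;
        }
    }

On a small grid the per-row tasks are too cheap for this to pay off; the usual fix is to batch rows into coarser chunks, which is where the tuning in the posts gets interesting.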

The AI Systems of Left 4 Dead

There's no PL content per se in this presentation, but a PL weenie will surely think of a DSL for almost every slide, so I hope posting this is warranted.

Tim Sweeney has written previously on the programming challenges raised by game development (The Next Mainstream Programming Languages: A Game Developer's Perspective), and I think this presentation is another showcase of the huge problems that need solving in game development.

Finally, I wonder why anyone would use a language that doesn't allow quick and simple syntax extension to drive a game engine. Seriously, the possibilities for ad-hoc DSLs seem endless.

Developer Guided Code Splitting

Google Web Toolkit, which compiles Java to JavaScript for running code in the browser, now includes Code Splitting, for reducing application download time:

To split your code, simply insert calls to the method GWT.runAsync at the places where you want the program to be able to pause for downloading more code. These locations are called split points.

A call to GWT.runAsync is just like a call to register any other event handler. The only difference is that the event being handled is somewhat unusual. Instead of being a mouse-click event or key-press event, the event is that the necessary code has downloaded for execution to proceed.
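Concretely, a split point looks like an ordinary callback registration. A sketch (SettingsDialog and the surrounding handler are made-up application names; GWT.runAsync and RunAsyncCallback are the real API):

    import com.google.gwt.core.client.GWT;
    import com.google.gwt.core.client.RunAsyncCallback;
    import com.google.gwt.user.client.Window;

    // Inside some widget's click handler:
    void onSettingsClicked() {
        GWT.runAsync(new RunAsyncCallback() {
            public void onFailure(Throwable reason) {
                Window.alert("Failed to download code: " + reason.getMessage());
            }
            public void onSuccess() {
                // Code reachable only from here may be moved by the compiler
                // into a separately downloaded fragment.
                new SettingsDialog().show();
            }
        });
    }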

Tim Bray on Clojure and Erlang

A short comparison (plus some links) of Erlang and Clojure solutions to the simple problem of running a counter in a separate thread.
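For comparison, the same exercise in plain Java (a sketch of ours, not one of the solutions Bray discusses): the counter runs on its own thread, and other threads observe it safely through an AtomicLong.

    import java.util.concurrent.atomic.AtomicLong;

    public class CounterDemo {
        public static void main(String[] args) throws InterruptedException {
            AtomicLong counter = new AtomicLong();
            Thread t = new Thread(() -> {
                // Count until asked to stop.
                while (!Thread.currentThread().isInterrupted())
                    counter.incrementAndGet();
            });
            t.start();
            Thread.sleep(1000);
            t.interrupt();
            t.join();
            System.out.println("ticks in ~1s: " + counter.get());
        }
    }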

Parallel Performance Tuning for Haskell

Parallel Performance Tuning for Haskell. Don Jones Jr., Simon Marlow, and Satnam Singh. Haskell Symposium 2009.

Parallel Haskell programming has entered the mainstream with support now included in GHC for multiple parallel programming models, along with multicore execution support in the runtime. However, tuning programs for parallelism is still something of a black art. Without much in the way of feedback provided by the runtime system, it is a matter of trial and error combined with experience to achieve good parallel speedups. This paper describes an early prototype of a parallel profiling system for multicore programming with GHC. The system comprises three parts: fast event tracing in the runtime, a Haskell library for reading the resulting trace files, and a number of tools built on this library for presenting the information to the programmer. We focus on one tool in particular, a graphical timeline browser called ThreadScope. The paper illustrates the use of ThreadScope through a number of case studies, and describes some useful methodologies for parallelizing Haskell programs.

A Java Fork/Join Framework

Doug Lea: A Java Fork/Join Framework, Proceedings of the ACM 2000 conference on Java Grande.

This paper describes the design, implementation, and performance of a Java framework for supporting a style of parallel programming in which problems are solved by (recursively) splitting them into subtasks that are solved in parallel, waiting for them to complete, and then composing results. The general design is a variant of the work-stealing framework devised for Cilk.
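In code, the recipe reads almost exactly as the abstract states it: split, fork one half, compute the other, join, combine. A sketch (the summing task and threshold are our own; the class names are those the framework eventually shipped under in Java 7):

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    class Sum extends RecursiveTask<Long> {
        static final int THRESHOLD = 10_000;  // arbitrary cutover to sequential
        final long[] a; final int lo, hi;
        Sum(long[] a, int lo, int hi) { this.a = a; this.lo = lo; this.hi = hi; }

        protected Long compute() {
            if (hi - lo <= THRESHOLD) {       // small subtask: solve directly
                long s = 0;
                for (int i = lo; i < hi; i++) s += a[i];
                return s;
            }
            int mid = (lo + hi) >>> 1;
            Sum left = new Sum(a, lo, mid);
            left.fork();                                  // run in parallel...
            long right = new Sum(a, mid, hi).compute();   // ...while we do the rest
            return right + left.join();                   // wait and combine
        }
    }
    // Usage: long total = new ForkJoinPool().invoke(new Sum(xs, 0, xs.length));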

This work is about to be incorporated into Java 7 as jsr166y:

Parallel*Array (often referred to as PA) and its planned follow-ons for sets and maps, provide an easier/better way of routinely programming to take advantage of dozens to hundreds of processors/cores: If you can think about a programming problem in terms of aggregate operations on collections of elements, then we can automate parallel execution. This generally pays off if either you have lots of elements, (in which case, it works well even if the operations are small/cheap), or if each of the operations are time consuming (in which case it works well even if there are not a lot of elements). To take advantage of this though, the aggregate processing must have a regular structure, which means that you must be able to express things in terms of apply, reduce, filter, map, cumulate, sort, uniquify, paired mappings, and so on.
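The aggregate style Lea describes looked roughly like this with the jsr166y preview jars (a sketch from memory of the preview API, whose package and class names shifted between snapshots; Student is a made-up domain class):

    import jsr166y.forkjoin.*;

    class Student { String name; int graduationYear; double gpa; }

    class BestGpa {
        // Find the best GPA among seniors, filtering and mapping in parallel.
        double bestSeniorGpa(Student[] data, ForkJoinPool fj) {
            ParallelArray<Student> students = ParallelArray.createUsingHandoff(data, fj);
            return students
                .withFilter(new Ops.Predicate<Student>() {        // keep seniors
                    public boolean op(Student s) { return s.graduationYear == 2010; }
                })
                .withMapping(new Ops.ObjectToDouble<Student>() {  // project the GPA
                    public double op(Student s) { return s.gpa; }
                })
                .max();
        }
    }

Note how the pipeline only names what to do per element; the regular structure is what lets the library pick the parallel schedule.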

Branching Time vs. Linear Time: Semantical Perspective

Sumit Nain and Moshe Vardi, Branching Time vs. Linear Time: Semantical Perspective, invited ATVA'07 paper.

...this paper puts forward an, admittedly provocative, thesis, which is that process-equivalence theory allowed itself to wander in the “wilderness” for lack of accepted guiding principles. The obvious definition of contextual equivalence was not scrupulously adhered to, and the underspecificity of the formalisms proposed led to too many interpretations of equivalence.

In revisiting the notion of process equivalence, which is a fairly central part of concurrency theory, Nain and Vardi end up arguing in favor of a purely trace-based notion of equivalence and the use of linear-time logics. This in turn leads to a rejection of bisimulation as a tool for establishing process equivalences:

The gist of our argument is that branching-time-based notions of process equivalence are not reasonable notions of process equivalence, as they distinguish between processes that are not contextually distinguishable. In contrast, the linear-time view does yield an appropriate notion of contextual equivalence.
...

In spite of its mathematical elegance and ubiquity in logic, bisimulation is not a reasonable notion of process equivalence, as it makes distinctions that cannot be observed.
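A standard textbook example (ours, not the paper's) of what is at stake: the processes a.(b + c) and a.b + a.c have exactly the same traces (the empty trace, a, ab, ac), yet they are not bisimilar, since after taking the a-step the first process still offers both b and c while the second has already committed to one of them. Bisimulation separates the two; trace equivalence does not. Nain and Vardi's complaint is that a separation of this kind must be justified by a context that can actually observe the difference.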

They take pains to point out that they are not claiming that bisimulation or CTL should be abandoned or are not useful. Rather, their emphasis is on the fact that bisimulation is not a contextual equivalence and is therefore not appropriate for establishing equivalence between (for example) a specification and its implementation. As they say in the conclusion of the paper:

While one may not realistically expect a single paper to overwrite about 30 years of research, a more modest hope would be for this paper to stimulate a lively discussion on the basic principles of process-equivalence theory.

Capabilities for External Uniqueness

Philipp Haller and Martin Odersky have submitted Capabilities for External Uniqueness to OOPSLA'09.

Unique object references have many important applications in object-oriented programming. For instance, with sufficient encapsulation properties they enable safe and efficient transfer of message objects between concurrent processes. However, it is a long-standing challenge to integrate unique references into practical object-oriented programming languages.

This paper introduces a new approach to external uniqueness. The idea is to use capabilities for enforcing both aliasing constraints that guarantee external uniqueness, and linear consumption of unique references. We formalize our approach as a type system, and prove a type preservation theorem. Type safety rests on an alias invariant that builds on a novel formalization of external uniqueness.

We show how a capability-based type system can be used to integrate external uniqueness into widely available object-oriented programming languages. Practical experience suggests that our system allows adding uniqueness information to common collection classes in a simple and concise way.

A prototype plugin for the Scala compiler can be found on the same page.
