
Models for distributed parallelism

I've been reading left and right (including on this forum) looking for models of parallel computing that are relevant to what I'm dealing with, and frankly not finding much. I get the impression that most theory of parallel computing is about shared memory, and so a bunch of old ideas for SMPs get recycled now that we have multi-core chips (http://lambda-the-ultimate.org/node/3963). However, in the early 90s I had 256 distributed-memory processors, a couple of years later 2048, when I started my current job 10,000 cores or so, and a year from now I'll have half a million cores. Most of the research into parallelism that I've seen is pretty much irrelevant there.

A typical parallel computation has a billion or so data points, does some data-parallel operation (a very complicated type of local averaging) on them, computes some global condition, and then repeats. In terms of "sequential consistency" (trivial: memory is distributed), completion (guaranteed), resources (none: everything is in memory), &c, this is all uninteresting. On the other hand, moving data around is very expensive, and it is the key to how well your algorithm performs.
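To be concrete, here is a minimal sketch of that iteration pattern in Python with mpi4py. The 1D block decomposition, the 3-point stencil, the periodic ring, and the tolerance are all illustrative assumptions for the sketch, not taken from any particular application:

    # Sketch only: decomposition, stencil, ring boundary and tolerance
    # are illustrative choices.
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, nprocs = comm.Get_rank(), comm.Get_size()

    # Each process owns a contiguous slab of the global array, plus one
    # ghost cell on either side for the neighbouring values it will need.
    n_local = 1_000_000
    u = np.random.rand(n_local + 2)

    left, right = (rank - 1) % nprocs, (rank + 1) % nprocs   # periodic ring

    for step in range(100):
        # 1. Move the data the local operation needs (halo exchange).
        u[-1] = comm.sendrecv(u[1],  dest=left,  source=right)
        u[0]  = comm.sendrecv(u[-2], dest=right, source=left)

        # 2. Purely local, data-parallel "averaging".
        u_new = u.copy()
        u_new[1:-1] = 0.5 * u[1:-1] + 0.25 * (u[:-2] + u[2:])

        # 3. One global condition, computed collectively.
        local_change = float(np.max(np.abs(u_new[1:-1] - u[1:-1])))
        global_change = comm.allreduce(local_change, op=MPI.MAX)
        u = u_new
        if global_change < 1e-8:
            break

All the interesting cost is in steps 1 and 3; the model question is how to talk about that cost formally.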

And I'm not finding any models that explicitly talk about moving data. Do they exist?

My current thinking is that I model an algorithm as a sequence of BSP-like supersteps, where each superstep is a bipartite graph: the input data set on one side, the output data set on the other, and the edges are the (data-parallel) instructions that take the one to the other. The big point is that I can now talk about how the three sets (input, output, instructions) are distributed, so I have a formal way of talking about communication.
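In code-ish terms, the bookkeeping I have in mind is something like the toy sketch below (plain Python; the names and the representation are just mine for illustration, and here the instructions are implicitly assigned to the processor of their output, though giving the instruction set its own distribution works the same way):

    # Toy encoding of a superstep as a bipartite graph between an input
    # and an output index set; all names here are illustrative.
    from collections import defaultdict

    class Superstep:
        def __init__(self, edges):
            # edges: (input_index, output_index) pairs; each pair is one
            # data-parallel instruction reading the input element and
            # contributing to the output element.
            self.edges = edges

        def communication(self, in_dist, out_dist):
            # Given distributions (index -> processor) of the input and
            # output sets, list the data movement the superstep induces:
            # every edge whose endpoints live on different processors
            # has to move its input element.
            msgs = defaultdict(set)   # (src_proc, dst_proc) -> input indices
            for i, j in self.edges:
                p, q = in_dist[i], out_dist[j]
                if p != q:
                    msgs[(p, q)].add(i)
            return dict(msgs)

    # Example: 3-point averaging on 8 points, block-distributed over 2 procs.
    n, nprocs = 8, 2
    dist = {i: i * nprocs // n for i in range(n)}      # block distribution
    edges = [(i + d, i) for i in range(1, n - 1) for d in (-1, 0, 1)]
    print(Superstep(edges).communication(dist, dist))
    # -> {(1, 0): {4}, (0, 1): {3}}: exactly the halo across the block boundary.

The point of the exercise is that the communication volume falls out of the distributions, instead of being something you bolt on afterwards.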

Does this sound remotely familiar to anyone? Are there models of parallel computation that are relevant to scientific computing?