Lambda the Ultimate

Parallel-Concurrent Programming Dept.
started 3/24/2004; 3:38:43 PM - last post 4/8/2004; 3:57:02 PM
Mark Evans - Parallel-Concurrent Programming Dept.
3/24/2004; 3:38:43 PM (reads: 393, responses: 15)

I am wondering whether LtU should have a parallel-concurrent programming department. There are just so many nice projects, for example, Cilk.

I won't lobby for a real-time department or an embedded programming department, but will just dream about them in my sleep.

Ehud Lamm - Re: Parallel-Concurrent Programming Dept.
3/25/2004; 2:46:30 AM (reads: 332, responses: 0)
How about a department dedicated to parallel/distributed programming?

David B. Wildgoose - Re: Parallel-Concurrent Programming Dept.
3/25/2004; 6:30:32 AM (reads: 319, responses: 0)
I would support that idea. Sorry to sound like a scratched record, but as multiple multi-core processors become the norm, every language is going to have to grapple with parallelism.

Mark Evans - Re: Parallel-Concurrent Programming Dept.
3/25/2004; 4:37:27 PM (reads: 289, responses: 0)

Yes Ehud, sorry I was not more explicit. The idea here includes distributed programming. Give the department some name that makes the most sense to you.

Actually the evolution of all these terms is a bit curious. "Concurrent" seems the new vogue, "parallel" the old term, and "distributed" somewhere in the middle. There is another word, "multiprocessing," which technically means a multi-processor board, but has morphed into a more general sense. Anyway it would be a good department. Parallel/distributed/concurrent/multiprocessing is the future.

Luke Gorrie - Re: Parallel-Concurrent Programming Dept.
3/25/2004; 4:40:59 PM (reads: 296, responses: 0)
I wonder...

Suppose everyone suddenly had four CPUs. Which programs in /usr/bin would be worth rewriting to take advantage of them, and which ones wouldn't it make any appreciable difference to?

David B. Wildgoose - Re: Parallel-Concurrent Programming Dept.
3/25/2004; 11:55:21 PM (reads: 283, responses: 0)
The obvious choice would be the grep family, but I actually suspect that it would rapidly become I/O bound.

Powerful but computationally intensive compression/decompression would be a gain, as the spare processing power could be used to boost effective network bandwidth.
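
A rough sketch of that idea, assuming Go's standard gzip package and an arbitrary chunk size (none of which appears in the thread): independent chunks are compressed by separate goroutines, so spare cores reduce the bytes that cross the network.

    package main

    import (
        "bytes"
        "compress/gzip"
        "fmt"
        "sync"
    )

    // compressChunk gzips one chunk; independent chunks can be handled by
    // separate CPUs, trading spare compute for effective network bandwidth.
    func compressChunk(chunk []byte) []byte {
        var buf bytes.Buffer
        zw := gzip.NewWriter(&buf)
        zw.Write(chunk)
        zw.Close()
        return buf.Bytes()
    }

    func main() {
        data := bytes.Repeat([]byte("example payload "), 1<<15)
        const chunkSize = 64 * 1024 // arbitrary illustrative chunk size

        var chunks [][]byte
        for off := 0; off < len(data); off += chunkSize {
            end := off + chunkSize
            if end > len(data) {
                end = len(data)
            }
            chunks = append(chunks, data[off:end])
        }

        compressed := make([][]byte, len(chunks))
        var wg sync.WaitGroup
        for i, c := range chunks {
            wg.Add(1)
            go func(i int, c []byte) {
                defer wg.Done()
                compressed[i] = compressChunk(c) // chunks compressed independently
            }(i, c)
        }
        wg.Wait()

        total := 0
        for _, c := range compressed {
            total += len(c)
        }
        fmt.Printf("%d bytes -> %d bytes in %d chunks\n", len(data), total, len(chunks))
    }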

Ultimately though, it is in the application space that the major wins will occur: databases, natural language processing, and so on. There will probably be a lot of speculative work carried out and then discarded, all in the name of faster responses to user requests.

My main machine at home is a dual-processor box that I've had for some time. I like to use a motoring analogy. Its CPUs might not "rev" as highly (clock rates), but it has more "torque". I hope you see what I am getting at.

Luke Gorrie - Re: Parallel-Concurrent Programming Dept.
3/26/2004; 6:57:08 AM (reads: 256, responses: 0)
Grep is plenty fast enough for me. The only programs that catch my eye in /usr/bin as needing more speed are tex and bzip, and I've no idea how parallelisable those are.

I was going to say gcc, but then I realised it already gets parallelism by the "software tools" approach, i.e. by running in parallel (or distributed) with make -j <n>. At first thought I also considered MP3 and movie players, but then realised that they are not CPU-bound at all.

There is a tremendous set of programs there that already perform perfectly adequately. So how much difference will SMP-everywhere really make to programming practice, I wonder?

As for databases and natural language processing, I don't think they answer the question. People doing that in a heavy-duty way will already buy multiprocessor machines if they need them, so I don't think "SMP everywhere" will make a big difference. I think that relatively few of the physical computers in the world are running such applications in a CPU-bound way.

"Multiprocessing everywhere considered irrelevant?"

Peter Van Roy - Re: Parallel-Concurrent Programming Dept.
3/30/2004; 11:04:38 AM (reads: 215, responses: 0)
Mark Evans: "Concurrent" seems the new vogue, "parallel" the old term, and "distributed" somewhere in the middle.

Actually, these words have quite precise technical meanings. Maybe it's belaboring the obvious, but let me define them briefly. "Concurrent" is a language concept that has nothing to do with performance; two operations are concurrent if their order is unspecified. "Parallel" is an implementation concept; it refers to implementation on multiple processors in order to improve performance. Concurrency and parallelism are orthogonal concepts; it's possible to have one or the other or both or neither.
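
A minimal sketch of that orthogonality, assuming Go's goroutines and GOMAXPROCS as the concurrency and parallelism knobs (Go is only an assumed vehicle here, not anything from the thread): the goroutines below are concurrent because their relative order is unspecified, while the GOMAXPROCS setting decides whether they may also run in parallel.

    package main

    import (
        "fmt"
        "runtime"
        "sync"
    )

    func main() {
        // Concurrency: the goroutines below have no specified relative order.
        // Parallelism: whether they may actually run simultaneously is an
        // implementation matter; with GOMAXPROCS(1) the program is concurrent
        // but not parallel, with a larger value it may also be parallel.
        runtime.GOMAXPROCS(1)

        var wg sync.WaitGroup
        for i := 0; i < 4; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                fmt.Println("task", id)
            }(i)
        }
        wg.Wait()
    }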

"Distributed" also refers to multiple processors, but in contrast to parallelism, the emphasis is on the new properties introduced because of geographic distribution. For example, communication performance, global synchronization, partial failure, security, and resource management.

There are two rather distinct communities that do distributed programming. The first is interested in distribution because of performance, e.g., they use clusters and do SETI@home. The second is interested in distribution because of collaboration, namely how to get independent entities to work together using a distributed system. The two communities typically interact very little. Some rapprochement can sometimes be seen, for example, the recent work on Grid architectures touches both communities.

Chris Rathman - Re: Parallel-Concurrent Programming Dept.
3/30/2004; 11:23:01 AM (reads: 209, responses: 0)
Under the category of parallel processing, I've been wondering whether anyone has tinkered with the idea of using a parallel processor architecture to do garbage collection?

Mark Evans - Re: Parallel-Concurrent Programming Dept.
3/30/2004; 2:30:12 PM (reads: 207, responses: 2)

Peter, nice diction would respect such precise meanings. Yet habit and practice have rendered them fungible. That was my point. They should have precise meanings, but don't. Such is life in software. So here is my take on current usage.

The term "multiprocessing" is better for what you call "parallel." One often hears of "parallel threads" executing on single processor platforms; hence common usage violates the proposed meaning of "parallel." The connotations of "multiprocessing" are better if one wishes to identify implementation on multiple processors.

We also have Beowulf clusters alternately labeled "parallel" or "distributed" depending on the speaker. Here "distributed" has no sense of geographic distribution. The computers all live in the same room. So I agree on the term "distributed" but qualify with this important exceptional case.

Most engineers would confuse "concurrent" with "parallel." The meaning proposed for "concurrent" seems narrow. There can be concurrent processes with or without indeterminate execution order. Consider a long task running simultaneously with a short task that may only commence after the longer one begins. The execution order is specified, but the tasks still run concurrently.
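
A small sketch of that scenario, again assuming Go and two hypothetical tasks: the short task's start is constrained to follow the long task's start, yet the two still run concurrently.

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        started := make(chan struct{}) // closed the moment the long task begins
        var wg sync.WaitGroup
        wg.Add(2)

        go func() { // the long task
            defer wg.Done()
            close(started)
            time.Sleep(200 * time.Millisecond) // stand-in for real work
            fmt.Println("long task finished")
        }()

        go func() { // the short task: may only commence after the long one begins
            defer wg.Done()
            <-started
            time.Sleep(50 * time.Millisecond)
            fmt.Println("short task finished")
        }()

        wg.Wait()
    }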

Darius Bacon - Re: Parallel-Concurrent Programming Dept.
3/31/2004; 3:57:06 AM (reads: 203, responses: 1)
For what it's worth, I understand those words as Peter does, and thought they were the usual meaning, given the usual amount of slop in how people use words.

Ehud Lamm - Re: Parallel-Concurrent Programming Dept.
3/31/2004; 4:07:05 AM (reads: 210, responses: 0)
Here's why I chose the name I did.

Peter Van Roy - Re: Parallel-Concurrent Programming Dept.
4/1/2004; 12:02:42 AM (reads: 182, responses: 0)
Mark Evans: Most engineers would confuse "concurrent" with "parallel."

The engineers who confuse the two are just that: confused. In computer science research the distinction between concurrency and parallelism is quite clear. For example, the CONCUR series of conferences is clearly about concurrency. The EUROPAR series of conferences is clearly about parallelism. Good books on distributed algorithms define concurrency exactly as I defined it.

Mark Evans - Re: Parallel-Concurrent Programming Dept.
4/2/2004; 3:46:42 PM (reads: 148, responses: 0)

Merely calling them confused does not help the situation, nor, for that matter, does it address the counterexamples. Equally respectable groups violate these definitions (e.g. by speaking of "parallel threads" on single processors). The point is not to stamp an Imprimatur on one definition and dismiss the rest, but to minimize conflict in a world of technical communication. A good approach is to remove ambiguity with terms to which everyone can subscribe. The term "multiprocessing" is clearly more precise and less ambiguous than "parallel" for conveying the meaning proposed (execution on multiple processors). I won't haggle over "concurrent" except to note that the Merriam-Webster dictionary of English offers "running in parallel" as one definition of that word. Have mercy!

Peter Van Roy - Re: Parallel-Concurrent Programming Dept.
4/4/2004; 8:31:28 AM (reads: 150, responses: 0)
The point is not to stamp an Imprimatur on one definition and dismiss the rest, but to minimize conflict in a world of technical communication.

My way is twofold: (1) define all possibly misunderstood words before I use them and (2) use definitions that are used by the best thinkers in the domain.

A good approach is to remove ambiguity with terms to which everyone can subscribe.

The problem here is how to be sure that you use terms to which "everyone can subscribe". This is wildly impractical (who is "everyone"? "everyone"'s use of the terms changes with time, etc.), and you will still have to define your terms before using them! I prefer to avoid this never-ending headache.

Mark Evans - Re: Parallel-Concurrent Programming Dept.
4/8/2004; 3:57:02 PM (reads: 105, responses: 0)

These terms find uses in multiple domains, including the broader English language itself, from which they derive. There is no "the domain." That level of narrow specialization mandates acronyms.

Yes, we define our terms; but communication fails in spite of this precaution if we define red as blue, up as down, hardware as software. Any written material involves some concept of "everyone" and the language which "everyone" shares. Dictionaries codify such knowledge. It is noteworthy that dictionary words often enjoy multiple meanings. However, I reject implications that (a) I am in conflict with the best thinkers in any domain, or that (b) you can identify or (c) understand them better than I can, or that (d) they think current terminology is optimal even in their own subdomain.