Lambda the Ultimate

inactiveTopic Concurrency-oriented programming
started 10/21/2003; 5:06:42 AM - last post 10/23/2003; 1:01:22 AM
Peter Van Roy - Concurrency-oriented programming  blueArrow
10/21/2003; 5:06:42 AM (reads: 1765, responses: 20)
Concurrency-oriented programming is a phrase invented by Joe Armstrong, the main designer of Erlang. Basically, we would like to write applications where the concurrency follows the concurrency of the problem. This should be easy (language support) and cheap (implementation support). As far as I know, there are only two languages that support COP: Oz and Erlang (I would love to be proven wrong here--please give me evidence for others, if they exist! I mean good implementations, not paper designs.)

The majority of existing languages are sequential; concurrency was added as an afterthought. This makes concurrent programming difficult for them. A typical example is Java: it has monitors (shared-state concurrency) and expensive threads. Two years ago, when I told the head of our department I wanted to teach concurrent programming in a second-year course, he exploded "That's impossible!".

The reaction of our department head is understandable: shared-state concurrency, the kind that Java has, is the hardest to program in. There are two other kinds of concurrency that are just as practical, but much easier to program: message-passing concurrency (asynchronous messages between sequential objects) and declarative concurrency (threads and dataflow synchronization added to functional programming).

The easiest is declarative concurrency (see chapter 4 of CTM). This seems to have been forgotten by almost everybody. Yet it is not new: the first article on it I found was by Gilles Kahn in 1974.

Declarative concurrency is so nice that I believe it should be the baseline execution model for functional programming. (Not strict or lazy evaluation, which are both sequential.) This leads to many good things, for example here are two. (1) All the usual functional building blocks become concurrency patterns. For example, Map is a broadcast that collects results, and FoldL is the heart of a concurrent object with internal state (it accumulates an internal state from a stream of messages). (2) I/O becomes very simple: program input is a stream and program output is a stream. This is a perfectly good solution to the problem of declarative I/O.

Ehud Lamm - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 5:22:56 AM (reads: 1768, responses: 0)
Let me repeat my question: does this make sequential programming easier?

You argue that sequential languages make concurrency hard. I agree. You say COP solves this problem. That sounds fine.

But to change our dominant approach from seuqential to COP, I want to know how COP languages help solve the well known difficulties we have with sequential programming. To wit, building software systems is still something were are not very good at doing.

Patrick Logan - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 5:32:58 AM (reads: 1762, responses: 1)
As far as I know, there are only two languages that support COP: Oz and Erlang

Would you consider implementations of Scheme (e.g. Gambit and I think PLT) that support processes on the scale and efficiency of Erlang to be COP?

Why or why not?

Peter Van Roy - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 5:58:14 AM (reads: 1776, responses: 0)
I don't think COP will solve all problems with sequential programming, just some of them. Some of the difficulties of sequential programming arise because concurrency is difficult. E.g., dependencies between components or objects: truly independent components or objects must be concurrent! If your program is sequential, then ipso facto you have designed in a dependency between all its components.

It's not enough to have cheap concurrency (although it is an important first step). To be easy to use, concurrency must also be transparent. For example, in Oz I can use any list operation in a concurrent setting (since a stream is simply a list with unbound tail that can grow during execution). In Erlang, when I send a message to a process, I don't know whether the process is on my machine or across the network, and I don't care. Does this work in Gambit and PLT or do I have to write different code?

Jacob Matthews - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 7:00:08 AM (reads: 1692, responses: 0)
Would you consider John Reppy's CML (http://cml.cs.uchicago.edu/) a bona-fide COP design? It introduces a very nice set of primitives to make programming concurrency easier, but the implementation is more focused on programming abstractions on a single computer than doing fancy network tricks like allowing processes to collaborate seamlessly across networks. That sounds to me more like a gap than a fundamental flaw (all you'd need to add your feature is a way to create a channel that talks over a network socket), but perhaps you've got a different perspective.

(FWIW, PLT has a concurrency library that introduces the same primitives CML does.)

David B. Wildgoose - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 7:19:33 AM (reads: 1674, responses: 0)
Just because Concurrency is "hard", doesn't mean that it is going to go away. Quite the opposite. Intel, AMD and Sun have all announced that they will be adding more CPU cores to their individual chips. In the past couple of days, ARM has announced the same. This means that even mobile phones, PDAs and set-top boxes can all be expected to contain multiple CPU cores (albeit on a single chip) in the very near future.

This isn't really surprising. There are very real limits on how much work can be squeezed through the bottleneck of a single CPU.

Prior to discovering the concepts behind declarative concurrency, the best solution I had found was Eiffel's SCOOP. It's not so much a question of Should we write our code in a distributed fashion, rather, given the need How can we do so? I find it surprising personally that this is an issue that is still yet to be widely addressed by language designers.

Chris Rathman - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 7:56:01 AM (reads: 1642, responses: 0)
Having become bogged down in relational database programming of late, I suppose my interests are a bit narrower at this point in time. The relational model is predicated on set operations, which theoretically provides a more declarative form of programming. Instead of specifying the sequence of events to take place, you declare the operations you want done on the set and let the RDMBS handle the specifics of how it wants to efficiently accomplish the task.

Wondering whether relational databases teaches us anything useful in terms of concurrency and declarative forms of programming? Or are sets too specific to model more general forms of concurrent languages?

Also, since Oz is heavily influenced by Prolog, I was wondering about your thoughts on the Mercury programming language. Mercury didn't really delve deeply into concurrency, but it is a nice marriage of logic and functional programming.

andrew cooke - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 8:46:29 AM (reads: 1624, responses: 0)
How does Erlang bind variables to values? One of the neat things about Oz that makes COP easy is the use of logic variables (I hope I have the right name - they can be declared without having a value and can be bound only once; access to an unbound variable blocks and assignment to a bound variable is an error). This, as is probably obvious (it soon feels obvious, anyway, if you read the book), makes syncronisation between processes easy to describe in a declarative way.

[So if Erlang does something like this then I guess it would be nice for the Scheme implementations too; I'm pretty sure such things exist (they do for CL, iirc)]

Sjoerd Visscher - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 9:13:27 AM (reads: 1586, responses: 1)
Maybe spreadsheats qualify? Certainly seems declarative concurrency to me.

Peter Van Roy - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 9:27:53 AM (reads: 1574, responses: 0)
Regarding whether or not CML is a COP: it's a matter of degree. Is concurrency so easy and cheap to use that programmers use it without hesitation? How much extra code does a programmer have to write to use concurrency (ideally, none)? I don't have programming experience in CML, so I defer to those who do to answer these questions.

Regarding Mercury: it is much closer to the traditional Prolog style than Oz. Mercury keeps the Horn clause syntax. Mercury adds higher-order functions, static typing, and modes to Prolog. Mercury has no specific support for concurrency (it still supports global backtracking, AFAIK). Mercury in its original version jettisoned logic variables and constraints (they are coming back, though). Personally, I find that Mercury is too strongly attached to the Prolog tradition--many interesting language ideas have not been tried out in Mercury because of that.

Erlang does not use dataflow variables (which are actually logic variables in Oz). It synchronizes using the 'receive' statement, which takes a message out of a process mailbox using pattern matching. Erlang has no specific support for declarative programming, but it could be programmed in a 'declarative style': without race conditions. This is not quite as elegant as Oz, e.g., using functional building blocks as concurrency patterns is not possible in Erlang.

Luke Gorrie - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 9:30:33 AM (reads: 1565, responses: 0)
Erlang does Message-Passing Concurrency, which is also covered in the book.

Neel Krishnaswami - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 12:35:31 PM (reads: 1474, responses: 1)
The biggest difference between CML and Erlang's concurrency models is that CML uses synchronous message passing where Erlang uses asynchronous message passing. When a CML thread sends a message, it blocks until someone picks it up.

Unlike Erlang threads, CML threads aren't born with a message queue -- instead, channels (message queues) are first class values in CML and a thread can be handed a set of channels it can listen on when it is created. (Or if you're really hard core you can create a channel over which channels can be sent as values.) Note that you never directly send a thread a message in CML: you always send a message into a channel, where some potentially unknown other thread can pick it up.

Dually, sends and receives (and other synchronous operations like nondeterministic choice) are also first-class values in CML; you can compose them to build fairly complex protocols and then treat them as single conceptual entities.

Luke Gorrie - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 1:28:27 PM (reads: 1441, responses: 1)
The first-class receives ("higher order concurrency") look great. I gotta hack me some CML.

Ehud Lamm - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 1:41:56 PM (reads: 1449, responses: 0)
Post the example code when you have it...

todd - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 2:20:24 PM (reads: 1415, responses: 0)
With threads assigned to channels how do you prevent priority inversion and deadlock?

Jacob Matthews - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 4:25:25 PM (reads: 1397, responses: 0)
Neel -

That's not quite true. CML supports both synchronous and asynchronous messages natively - channels create synchronous communication and mailboxes create asynchronous communication. They have similar interfaces, but a CML.send (which works on a channel) blocks until somebody actually reads the message whereas Mailbox.send won't ever block.

You can implement mailboxes with channels but not the other way around (more generally, you can't get n-way rendezvous if all you have is (n-1)-way rendezvous), so if you've got to pick one channels are your best bet. But having both is awfully convenient. Besides, if you really want to, you can give yourself as big a headache as you'd like by creating channels over mailboxes, mailboxes over channels, channels that send channels that send mailboxes that receive channels, syncvars that hold mailboxes of channels, and so on.

By the way, hope you're enjoying CMU. Too bad we couldn't lure you over to U of C ... :)

Patrick Logan - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 7:00:45 PM (reads: 1358, responses: 0)
In Erlang, when I send a message to a process, I don't know whether the process is on my machine or across the network, and I don't care. Does this work in Gambit and PLT or do I have to write different code?

Gambit and PLT are Scheme. You get to build whatever you want, as long as the base has scalable lightweight threads.

That's the beauty of it, as they say.

Peter Van Roy - Re: Concurrency-oriented programming  blueArrow
10/21/2003; 11:15:26 PM (reads: 1337, responses: 0)
Spreadsheets are a very interesting example. They're actually not declarative concurrent because of their user interface: the user *picks* any cell to update. This picking is a source of nondeterminism. So there is an observable nondeterminism and the user is the source of it! (Another way to say it is: if the user and the spreadsheet program agree on the order in which the user will update the cells, then it is declarative concurrent.)

In the terminology of CTM, a spreadsheet would look like this: each operation is a port object and each spreadsheet cell is a port object. The operations and spreadsheet cells are connected in a bipartite graph. Updating a spreadsheet cell sends a message to all the operations taking input from that spreadsheet cell.

Exercise for the reader: implement this :-) !

Kimberley Burchett - Re: Concurrency-oriented programming  blueArrow
10/22/2003; 6:33:41 PM (reads: 1185, responses: 0)
You said that for objects to be independent, they have to be concurrent. I've been developing a similar feeling about lazy evaluation -- it decreases coupling in ways that I have yet to fully appreciate. However it's not clear to me what advantages true concurrency has over and above mere laziness. Would you elaborate?

Mark Evans - Re: Concurrency-oriented programming  blueArrow
10/22/2003; 9:51:16 PM (reads: 1167, responses: 0)

I would love to be proven wrong here--please give me evidence for others, if they exist! I mean good implementations, not paper designs.

Chasing some previous remarks I'll float LabVIEW. Certain terms in the Oz literature ring LabVIEW bells in my head. When I hear the term "data flow," LabVIEW's graphical wires come to mind. LabVIEW gives automatic concurrency - if no wires form dependencies between computational nodes, the nodes execute concurrently. It's concurrency for free. LabVIEW is a very professional, mature, compiled, commercial language, with a user base in the low millions, but seemingly little known in academic CS circles. That's a shame because it might be an ideal tool for introductory education. One of the problems is that NI markets the language not as the wonderful, general-purpose software development tool it really is, but as a vehicle to sell specialized NI hardware in the niche DAQ market. I have it from an employee that NI makes money on DAQ boards, not software.

Peter Van Roy - Re: Concurrency-oriented programming  blueArrow
10/23/2003; 1:01:22 AM (reads: 1171, responses: 0)
You said that for objects to be independent, they have to be concurrent. I've been developing a similar feeling about lazy evaluation -- it decreases coupling in ways that I have yet to fully appreciate. However it's not clear to me what advantages true concurrency has over and above mere laziness. Would you elaborate?

Lazy evaluation between a producer and a consumer does coroutining. There is a kind of lockstep synchronization between producer and consumer. Whenever the consumer needs something, it passes control to the producer and waits. When the producer has produced it, it passes control back to the consumer, which continues where it left off.

If you want to build a bounded buffer between the producer and consumer, then you need to give the producer some slack. That is, if the buffer is of size n, then the producer can be anywhere from 0 elements ahead of the consumer to n elements ahead. With lazy evaluation, it's not possible to have this slack. To get it, you need concurrency. Chapter 4 in CTM gives code that shows how this works. BTW, the simplest implementation of a bounded buffer uses *both* lazy evaluation and concurrency.

Lazy evaluation is more an issue of resource management. For example, let's say you want to read and process a very big file, which may be bigger than the memory size of your program. If the file is read lazily (according to the need of the consumer), then this can be done even if your system has very little memory (as long as the consumer only needs to look at a small part of the file at a time!). For this use of lazy evaluation, you don't need concurrency.

This only scratches the surface of the relationship between laziness and concurrency! For more information, see chapter 4 in CTM.