Distributed Objects vs. Messaging

Oz supports distributed objects, and Jini can move objects around the network. Erlang, generally speaking, moves only data around, and in the Web Services debate there are folks who argue that data-centric is the best way to go.
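To make the contrast concrete, here is a toy Scala sketch (all names are hypothetical, not taken from any of the systems above): the object-centric style hands the caller a reference to remote behavior, while the data-centric style sends only immutable values across the wire.

```scala
// Object-centric: the caller holds a (possibly remote) reference to behavior;
// each method call may be a network round trip.
trait AccountRef {
  def deposit(amount: BigDecimal): Unit
  def balance: BigDecimal
}

// Data-centric: only immutable values cross the wire; the behavior stays put.
sealed trait AccountMsg
case class Deposit(account: String, amount: BigDecimal) extends AccountMsg
case class BalanceQuery(account: String)                extends AccountMsg
case class BalanceReply(account: String, balance: BigDecimal)
```

The second style couples the parties to a message schema rather than to an interface, which is exactly the trade-off debated below.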

As with all design issues, there are probably several dimensions involved in deciding which approach to take; I'm curious to hear what people believe those are (the web services article lists "evolution, intermediaries, and recoverable operations") and when the tipping point shows up in favor of one approach or the other.

Why stick to only one?

The web-calculus allows both objects and messages: data travels via Exportable objects (essentially immutable structs/records), while the default is marshal-by-ref objects that reside on the server.
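A sketch of that split in Scala (my names, not the web-calculus API): immutable records are safe to marshal by copy, while stateful objects are better exported as opaque handles that stay on the server.

```scala
// Pass-by-copy: an immutable record ("Exportable"), safe to serialize whole.
case class Quote(symbol: String, price: BigDecimal)

// Marshal-by-ref: mutable server-side state, so we export a handle
// (in the web-calculus this role is played by a URL) instead of the object.
class OrderBook { private var bids: List[Quote] = Nil }
case class RemoteHandle(url: String)

def register(o: AnyRef): String = s"https://example.org/obj/${o.hashCode}" // toy registry
def marshal(x: AnyRef): Either[RemoteHandle, Quote] = x match {
  case q: Quote     => Right(q)                        // copy the data
  case o: OrderBook => Left(RemoteHandle(register(o))) // reference the object
}
```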

Re: web calculus

Neat, I recall looking at that before and bookmarking it :-) To ask a question which I will try to figure out over time by perusing the net: is it a lot along the lines of E?

[as a complete and utter aside, I liked reading a quote from somebody famous, I think via the Cal docs, that we aren't so much doing calculus as we are doing algebra.]

Note that even if there is a theory which attempts to give us the power of either approach, we still need to learn at least rules of thumb for when a given approach is best to use. Is that something touched upon in the web-calculus?

I'm the kind of person who doesn't pick up theory too quickly, so I'd really love to find theory matched up with practical examples / real-world experiences.

To ask a question which I

To ask a question which I will try to figure out over time by perusing the net: is it a lot along the lines of E?

Yes, but not "full" E. See Mark Miller's post on the topic. He also provides a link to my port of the Waterken server to C#/ASP.NET. That whole thread is actually very interesting. The web-calculus is focused on language-neutral distributed object computation, and at the moment is tied closely to HTTP and web services.

Note that even if there is a theory which attempts to give us the power of either approach, we still need to learn at least rules of thumb for when a given approach is best to use. Is that something touched upon in the web-calculus?

Well, there's the usual "stateful vs. stateless" difference between data and objects. Stateless is always preferable if it maps naturally to the solution (at least for web apps), since it's more scalable given the nature of the web (see: Google).

However, as continuation-based web programming demonstrates, state is sometimes necessary and natural, and the web-calculus provides such continuations for free, since browsing a web-calculus site is simply traversing the object graph of your application.
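A toy Scala illustration of that last point (the domain is hypothetical): each step of an interaction is an object whose methods return the next object, so the "continuation" is just wherever you happen to be in the graph.

```scala
// Each step of a checkout is an object; following a link invokes a
// method that returns the next step. No session table required: the
// pending state of the interaction *is* the object you hold.
case class Cart(items: List[String]) {
  def add(item: String): Cart = Cart(item :: items)
  def checkout: Payment       = Payment(items)
}
case class Payment(items: List[String]) {
  def pay(card: String): Receipt = Receipt(items, paid = true)
}
case class Receipt(items: List[String], paid: Boolean)

// "Browsing" the site is just traversing the graph:
val receipt = Cart(Nil).add("book").checkout.pay("test-card")
```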

You mention flow-based programming below. Related to E, Oz, concurrency, messaging, etc. are dataflow and functional reactive programming (FRP). FRP is something I'm exploring now, to see how it compares with E's event loop approach and Erlang's Actors.
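For readers who haven't met FRP: the core idea is a time-varying value whose dependents update automatically. Here is a minimal hand-rolled sketch in Scala (real FRP systems also handle glitch-freedom and update ordering, which this ignores):

```scala
// A push-based "signal": a time-varying value plus its observers.
class Signal[A](private var value: A) {
  private var observers: List[A => Unit] = Nil
  def now: A = value
  def set(a: A): Unit = { value = a; observers.foreach(_(a)) }
  def map[B](f: A => B): Signal[B] = {
    val out = new Signal(f(value))
    observers ::= ((a: A) => out.set(f(a))) // the derived signal tracks this one
    out
  }
}

val mouseX  = new Signal(0)
val doubled = mouseX.map(_ * 2) // declared once, kept in sync from then on
mouseX.set(21)
assert(doubled.now == 42)
```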

Re: FRP

I'm totally into learning things like FRP, if only because a lot of the examples are about developing video games :-) Well, that, and the fact that shared-mutable-state concurrency seems to be considered debunked in places like LtU, but not in the "real world", more's the pity. So I'd like to understand the alternatives if I ever get the chance to advertise / recommend / use them.

I thought the linked flow page was interesting because it does a necessary, if not entirely sufficient, job of "compare and contrast", helping me understand the subtleties among the different-but-seemingly-very-similar 'cognates'. How does Timber differ from Erlang from CSP from Occam from Scala Actors from FBP from Linda from Termite from C-omega from GdH from etc.? Is there something to unify them? Should we want to unify them? Or, what are the core things determined via use and experience to be good vs. not?

I think I'm possibly drifting.

To get an idea of how

To get an idea of how complicated all this is, see if you can understand this (the Wikipedia article on the Actor model). It is also interesting how much discussion there is on its discussion tab.

I am also struck by the fact that there is no discussion at all of information theory in this debate. In communications theory errors are expected and are dealt with by coding, and coding can make a channel as reliable as we like.
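To make the coding-theory point concrete, here is the simplest possible example, a 3x repetition code, as a Scala sketch (real systems use far more efficient codes, but the principle, redundancy buys reliability, is the same):

```scala
// Toy 3x repetition code: tolerates any single bit-flip per triple.
def encode(bits: List[Int]): List[Int] = bits.flatMap(b => List(b, b, b))
def decode(coded: List[Int]): List[Int] =
  coded.grouped(3).map(g => if (g.sum >= 2) 1 else 0).toList

val noisy = encode(List(1, 0, 1)).updated(4, 1) // the channel flips one bit
assert(decode(noisy) == List(1, 0, 1))          // ...and coding recovers it
```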

A separate concern?

Most application-level protocols leave things like error detection/correction and connection management to lower-level protocols; the exceptions are applications with unusual QoS requirements (streaming media, especially the full-duplex variety, where predictable throughput and/or latency is more important than error-free transmission). Non-delivery of messages is still an issue the application level has to deal with, but it's often taken for granted that the bytes I write into one end of a socket will come out the same at the other end--assuming they come out at all.
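In concrete terms (a Scala sketch over java.net; the host, port, and recovery actions are placeholders): TCP hands the application intact bytes or nothing, so the application-level concern reduces to detecting absence, typically with timeouts.

```scala
import java.net.{Socket, SocketTimeoutException}

val buf = new Array[Byte](4096)
def reconnect(): Unit = ()     // placeholder recovery actions
def retryOrGiveUp(): Unit = ()

val sock = new Socket("example.org", 7777) // hypothetical service
sock.setSoTimeout(5000)                    // non-delivery is *our* problem
try {
  val n = sock.getInputStream.read(buf)    // bytes arrive intact, or...
  if (n == -1) reconnect()                 // ...the peer closed on us
} catch {
  case _: SocketTimeoutException => retryOrGiveUp() // ...or nothing arrived
}
```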

What kind of unreliability?

Yes, but reliability is mentioned several times in the above links. What is the nature of this unreliability? Is it the processes themselves? If so, information theory may be applied at a higher level, to the system as a whole. Processes viewed as automata have a language; in information theory this is a code. When a process works, we say that it parses its messages and all is well. If it doesn't work and ends in an error state, then that error state could be used to correct the problem. Generally this is not done: errors cause termination, and that is the end of that. Even when many error codes are provided, as in the socket system, they are rarely even printed out.

I have my own ideas about why this is the case. Basically, the languages we use, such as C++, are too low-level to handle the problem easily. Agent (or Actor) languages are much higher-level. Pick your own favorite.

Edit: Any discussion of reliability (i.e., errors) is an information theory problem. If we find that the errors are only mistakes or accidents, then we can fix the problem. When there are no more errors, we can dispense with the information theory.

Re: own theory on error handling (lack thereof)

Also, I'd hazard a guess that it is hard to get the modeling of the error system right. People have a hard enough time specifying what they want in a fashion that survives from initial idea all the way through to a long-lived, successful system. Getting the error model right is presumably even harder, because you are by definition dealing with the, if not infinite, then at least very large, unknown.

One can of course try to partition that with, e.g., types of exceptions, but if you get that modeling wrong (e.g., not granular enough in some case), then your error handling might think it is taking the correct corrective action when in fact it isn't.
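A small Scala illustration of that trap (the exception names are hypothetical): if the hierarchy lumps retryable and non-retryable failures into one bucket, the "corrective" action is wrong for half the cases; make the partition granular enough and the right remedy falls out of the match.

```scala
// Too coarse: one bucket for every network problem, so a catch-and-retry
// policy will also retry failures that can never succeed.
class NetworkError(msg: String) extends Exception(msg)

// Granular enough to pick the right remedy:
sealed abstract class NetError(msg: String) extends Exception(msg)
final class Transient(msg: String) extends NetError(msg) // timeout: retry is sane
final class Permanent(msg: String) extends NetError(msg) // bad address: retry is futile

def handle(e: NetError): Unit = e match {
  case _: Transient => () // back off and retry
  case _: Permanent => () // fail fast and alert a human
}
```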

So an underlying [OTing] question now in my head is: what are good approaches (based on theory, tempered by real experience) to defining how to model, and thus handle, errors? Exceptions à la Java are nice and all, but I don't think there's currently a winning, coherent theory or praxis around how to design and use them.

Handling errors the computer

Handling errors the computer scientist way: partial or full correctness, plus fault-tolerant computing for hardware errors.

Being pedantic for a moment,

is there really such a thing as complete correctness? I suspect that anything so labeled gets away with it because of some semantic sleight-of-hand. Or, put the other way around, something claimed to be complete could be shown to be incomplete by extending the event horizon a smidgen?

You need to ask...

..."correct with respect to what?"

We can definitely prove that some programs are correct with respect to a formal specification. This is "just" math, and once the proof is done we have as much confidence that the program is correct with respect to the specification as we have in any other mathematical theorem.
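For a tiny but complete taste of that, here is a toy program, a formal specification, and a machine-checked proof in Lean 4 (my example, assuming a recent Lean with the omega tactic):

```lean
-- A toy "program"...
def double (n : Nat) : Nat := n + n

-- ...and a proof that it is correct with respect to its specification.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```

The catch, as the next paragraph says, is that nothing can certify that `double n = 2 * n` was what you actually meant.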

However, this should naturally lead you to wonder "but how do I know the specification is correct with respect to my intention?" And equally obviously, this is not formalizable, because a human being's intent is not a formal thing. We can have a conversation to figure out some properties that we want to hold of the spec, and formalize those properties, but there is no -- and cannot be any -- guarantee that we will end up with a complete set.

A ha-ha-only-serious way of putting this is to say that the purpose of programming language research is to bring us to a world where all bugs arise from conceptual errors, and no bugs arise from implementation errors.

Or in the terminology of Brooks

To completely eliminate the accidents from programming, and deal exclusively with the essence.

Of course, an ideal proof system for programming languages will go beyond eliminating the accidents, and also inform the user when the essence that he/she requests is utter bollocks. :) Or some Turing-friendly approximation thereof.

Re: eliminate the accidents

[Drifting ever more off-super-parent-topic, but I'd be happy to learn about this] What is the most concrete system for that today? Praxis Ada, or Coq, or something-plus-something (specs-plus-implementation)? Are there any concrete systems that are "lightweight" (ha ha)?

I recall [edit: found (home page) (before on LtU)] a specification language that was a restricted natural language (English) which could be automatically turned into code / tests - it sounded like a very interesting possible sweet spot (vs. making the Product Manager learn Z'ed').

Not so difficult

So an underlying [OTing] question now in my head is: what are good approaches (based on theory, tempered by real experience) to defining how to model, and thus handle, errors?

It really isn't all that hard. Here is a list of socket errors generated by the Winsock system. I think you can see that they are all self-explanatory and suggest a solution. Since you seem to be interested in ready-to-go practical answers you might not be interested in this, but there is a straightforward way to use these error messages to correct the problem. First of all, the information content must be represented; description logic would be a good way to start. We can then simply query the knowledge base, along with other situational factors, for a new goal or next step. This process has a good chance of solving the problem. If it doesn't work (i.e., another error), we can simply use it as a "learning" experience (i.e., machine learning) and send out for a pizza (with sausage and mushrooms).
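A drastically cruder version of that idea as a Scala sketch (the Winsock codes are real; the recovery table is a hypothetical stand-in for a description-logic knowledge base):

```scala
sealed trait NextStep
case object Retry       extends NextStep
case object Reconnect   extends NextStep
case object AskOperator extends NextStep // the "send out for a pizza" case

// Query the (toy) knowledge base for a new goal, given the error code.
def plan(winsockError: Int): NextStep = winsockError match {
  case 10060 => Retry       // WSAETIMEDOUT: the peer may just be slow
  case 10061 => Reconnect   // WSAECONNREFUSED: try again later or elsewhere
  case 10054 => Reconnect   // WSAECONNRESET: the connection dropped
  case _     => AskOperator // unknown: record it as a "learning" example
}
```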

Re: pizza

I kinda like the ANSI standard more :-)

Some error taxonomies

Via Artima.

More errors

I would expand the failure category into at least three types: timing, memory, and other random events. Timing and memory errors often appear as mysterious random errors, but are completely avoidable. In fact, many apparent memory errors may actually be timing errors.
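Encoding that expanded taxonomy as types, a hypothetical Scala sketch, at least forces every branch to be considered rather than lumped under "random":

```scala
sealed trait Failure
case class Timing(deadlineMs: Long, actualMs: Long) extends Failure // avoidable
case class Memory(requestedBytes: Long)             extends Failure // avoidable
case class RandomEvent(description: String)         extends Failure // the remainder

def diagnose(f: Failure): String = f match {
  case Timing(d, a)   => s"missed a ${d}ms deadline by ${a - d}ms" // often mislabeled 'random'
  case Memory(b)      => s"allocation of $b bytes failed"
  case RandomEvent(w) => s"genuinely unexpected: $w"
}
```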

Re: complicated

Re: Actor model: I'm familiar with the ideas at a high level (having, e.g., looked at Erlang, Timber, etc.), but not the nuances - I'll read through the discussion there. Also along those lines is flow-based programming.

Re: Theory: While I am not great at learning theory, I definitely would rather use a system that is based on it - all those hackish languages and designs that could be much simpler or more powerful, if only the designer knew the relevant theory, drive me a little nuts. Note that this also goes for history (à la being doomed to reinvent Lisp).

(Presumably the theory around this topic might want to take into consideration the heterogeneous nature of the current internet, and try to play well with systems which were not themselves designed with good theory in mind?)

loosely coupled components

in my experience the major advantage of data-driven design is that it encourages more loosely coupled components.

the article only touches on this briefly, in Item 18: "Prefer context-complete communication styles." The OOP-centric design approach of remote interfaces tends to be chattier on the network than sending context-complete document messages.

what this implies is that data-driven architecture enforces a certain autonomy in the system components. this is a very desirable constraint during system evolution, especially where resilience or architectural scalability matter.

interestingly enough, the "data-driven" vs "model-driven" issue has been iterated through many times before the recent incarnation of distributed objects: host-based networking vs distributed operating systems, HTTP transport vs application protocols, messaging vs RPC - a certain trend is already historically documented. :)
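in code, the Item 18 distinction looks roughly like this (a hypothetical Scala sketch): the remote-interface style pays a potential round trip per getter, while the document style ships the whole context as one self-describing unit.

```scala
// Chatty remote interface: N fields, up to N network round trips,
// and the server must hold the Customer while the conversation lasts.
trait CustomerRef {
  def name: String
  def address: String
  def creditLimit: BigDecimal
}

// Context-complete document: one message carries everything the receiver
// needs, so it can be validated, logged, queued, and routed as a unit.
case class CustomerDocument(name: String, address: String, creditLimit: BigDecimal)
```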

In my experience the major

In my experience the major advantage of data-driven design is that it encourages more loosely coupled components.

Well, it encourages coupling components to particular data formats and protocols. Calling this "loose coupling" is a euphemism. I could provide examples of this kind of coupling from my daily practice, and they could serve as counterexamples and anti-patterns.

I would like to de-center all these X-centric software ideologies with the assertion that heterogeneous systems-of-systems are the norm, not the exception, and that the functionality which couples/translates data and objects, separated into its own layer, is foremost what enables components to be components (much as a membrane enables a cell to be an open and autonomous system at the same time). This is no more revolutionary than ORM, but a shift in the accent.
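A minimal sketch of such a membrane in Scala (the formats and names are hypothetical): the translation between the wire representation and the internal model lives in one layer, so neither side is coupled to the other's shape.

```scala
// External wire representation (what the protocol speaks).
case class WireOrder(id: String, qty: String) // stringly typed, as on the wire

// Internal model (what the component reasons about).
case class Order(id: Long, quantity: Int)

// The membrane: the only place where the two representations meet.
object OrderMembrane {
  def in(w: WireOrder): Either[String, Order] =
    try Right(Order(w.id.toLong, w.qty.toInt))
    catch { case _: NumberFormatException => Left(s"malformed order: $w") }
  def out(o: Order): WireOrder = WireOrder(o.id.toString, o.quantity.toString)
}
```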

Well, it encourages coupling

Well, it encourages coupling components to particular data formats and protocols. Calling this "loose coupling" is a euphemism.

i see a valid euphemism (here, did it again) in differentiating coupling as a quality measure from a system architectural point of view.

coupling components through protocols tends to yield a different component design than coupling them through interwoven logic.

i absolutely agree that the point is kind of moot, as neither of the x-centric approaches forces you to do things in the wrong way.

i do not agree, though, that this shift in the accent is valid; contrary to peripheral technology like ORM, component design is the central engineering aspect of software development, and in my opinion and experience a stronger emphasis on component borders in language and design is a good thing.

i am with the agile crowd here: sometimes, pain is good for you.