Branching Time vs. Linear Time: Semantical Perspective

Sumit Nain and Moshe Vardi, Branching Time vs. Linear Time: Semantical Perspective, invited ATVA'07 paper.

...this paper puts forward an, admittedly provocative, thesis, which is that process-equivalence theory allowed itself to wander in the “wilderness” for lack of accepted guiding principles. The obvious definition of contextual equivalence was not scrupulously adhered to, and the underspecificity of the formalisms proposed led to too many interpretations of equivalence.

In revisiting the notion of process equivalence, which is a fairly central part of concurrency theory, Nain and Vardi end up arguing in favor of a purely trace-based notion of equivalence and the use of linear-time logics. This in turn leads to a rejection of bisimulation as a tool for establishing process equivalences:

The gist of our argument is that branching-time-based notions of process equivalence are not reasonable notions of process equivalence, as they distinguish between processes that are not contextually distinguishable. In contrast, the linear-time view does yield an appropriate notion of contextual equivalence.
...

In spite of its mathematical elegance and ubiquity in logic, bisimulation is not a reasonable notion of process equivalence, as it makes distinctions that cannot be observed.

They take pains to point out that they are not claiming that bisimulation or CTL should be abandoned or are not useful. Rather, their emphasis is on the fact that bisimulation is not a contextual equivalence and is therefore not appropriate for establishing equivalence between (for example) a specification and its implementation. As they say in the conclusion of the paper:

While one may not realistically expect a single paper to overwrite about 30 years of research, a more modest hope would be for this paper to stimulate a lively discussion on the basic principles of process-equivalence theory.


Refusal Semantics

Having briefly skimmed the paper, it appears that CSP's refusals-based semantic models are exactly what the authors seek. Indeed, CSP's notions of process equivalence were (AFAIK) explicitly designed to have the kinds of properties the authors mention as desirable in a notion of process equivalence.

It is amazing that this paper was written without reference to these models.

Also of relevance is a recent paper by Lowe that shows how to express bounded LTL formulas in terms of CSP process refinements in CSP's "refusal traces" model, which augments standard traces semantics with refusal information.

Not quite equivalent

My initial reaction to this paper was much the same as yours: "sounds like CSP's process equivalences". However, on further reflection it seems to me that there is a (slight) difference. CSP's refusal-based semantics do seek to address the same issue (falling into the category of "decorated traces", which the authors do mention), but do so by defining the decorations within the semantic model and providing multiple models that define different equivalences. In contrast, I think Nain and Vardi are arguing that there should be a single model of equivalence (traces) and that decorations should be left to the modeler (who presumably has the most direct knowledge of exactly what decorations make sense for a given application). Another way of viewing this is that the modeler should be required to explicitly define what is observable about the system. The authors freely admit that there's some debate about whether such an approach would be too cumbersome in practice. Their response appears to be that (a) details that are important to the application should be specified by the modeler, and (b) common observables (e.g. deadlock) could be built into the modeling semantics.

I agree that it would have been nice to see the CSP denotational semantics directly discussed. However, the authors do at least refer to Stephen Brookes' work on trace-based equivalences. That paper does discuss the relationship between failures, failures-divergences, and Brookes' proposed trace-based equivalences (hardly surprising, since Brookes, along with Bill Roscoe, did much of the early work on establishing the denotational semantics of CSP).

Personally, I've found that CSP's existing semantic models have worked quite well for me in the past. But I can see some merit in the argument for a single definition of process equivalence.

Recondite arguments

I wish they hadn't included the CSP examples. It seems they are discussing the relative merits of process modeling languages, not the expressiveness of linear time logic and branching time logic. (I don't accept their claim on receptiveness and deadlock; an algorithm should be expressed as directly as possible.) A restriction to Buechi automata should have been enough. Their argument becomes even worse by introducing a third model, transducers, which I think adds little to the argument.

Their claims on bisimulation also make me wonder. What kind of bisimulation are they talking about? As far as I remember, bisimulation1 is a technique to prove trace equivalence, or inclusion, between two systems; it need not imply structural equivalence as they claim.

Sorry, but I find this a paper which runs off in a thousand different directions without ever really touching the heart of the discussion: whether a very technical logic like LTL should be preferred over another very technical logic like CTL - and even then I'd have preferred it if they had discussed various LTL and CTL versions with respect to real-world examples.

1 What about refinements?

CSP Examples

The CSP examples are misleading and also not valid CSP code.

The natural expression of their first example in CSP is

(a?x -> h!x -> STOP) |~| (b?x -> h!x -> STOP)

where |~| represents CSP's internal choice operator.

This is distinguished syntactically from the second example:

(a?x -> h!x -> STOP) [] (b?x -> h!x -> STOP)

and is distinguished in all process equivalences except the traces model.
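
To see this concretely, here is a minimal Haskell sketch (my own illustration, not from the paper or from any CSP tool; the event names A, B, H and the Proc type are invented) of the standard traces and stable-failures equations for just this fragment. It shows the two processes have identical trace sets but different failures:

import Data.List (nub, sort, subsequences)

data Event = A | B | H deriving (Eq, Ord, Show)
data Proc  = Stop | Prefix Event Proc | EChoice Proc Proc | IChoice Proc Proc

alphabet :: [Event]
alphabet = [A, B, H]

-- traces(P): the finite sequences of events P can perform
traces :: Proc -> [[Event]]
traces Stop            = [[]]
traces (Prefix e p)    = [] : [ e : t | t <- traces p ]
traces (EChoice p q)   = nub (traces p ++ traces q)
traces (IChoice p q)   = nub (traces p ++ traces q)

-- failures(P): pairs (s, X) such that P can perform s and then refuse all of X
failures :: Proc -> [([Event], [Event])]
failures Stop            = [ ([], x) | x <- subsequences alphabet ]
failures (Prefix e p)    = [ ([], x) | x <- subsequences alphabet, e `notElem` x ]
                        ++ [ (e : s, x) | (s, x) <- failures p ]
failures (EChoice p q)   = [ ([], x) | ([], x) <- failures p, ([], x') <- failures q, x == x' ]
                        ++ [ (s, x) | (s, x) <- failures p ++ failures q, not (null s) ]
failures (IChoice p q)   = nub (failures p ++ failures q)

pInt, pExt :: Proc
pInt = IChoice (Prefix A (Prefix H Stop)) (Prefix B (Prefix H Stop))  -- the |~| version
pExt = EChoice (Prefix A (Prefix H Stop)) (Prefix B (Prefix H Stop))  -- the [] version

main :: IO ()
main = do
  print (sort (traces pInt) == sort (traces pExt))  -- True: identical trace sets
  print (([], [A, H]) `elem` failures pInt)         -- True: pInt may refuse a initially
  print (([], [A, H]) `elem` failures pExt)         -- False: pExt cannot refuse a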

FDR != CSP

Although the FDR ASCII syntax for CSP is pretty widely used these days, some authors do use other syntax. In this case, Nain and Vardi appear to be using a syntax similar to the one used by Stephen Brookes in his recent publications, which I guess isn't surprising, since the basic approach they are arguing for seems somewhat inspired by Brookes' work on trace-based equivalences.

You are quite correct that the two processes (using either the Brookes syntax or the FDR syntax) are distinguished in models other than the traces model. However, those models require that the traces be annotated at each step with additional information (refusals) that is not observable, whereas the authors are arguing that given a more explicit process definition the processes could be distinguished on the basis of traces alone.

Language and logic

It seems they are discussing the relative merits of process modeling languages, not the expressiveness of linear time logic and branching time logic... Their argument becomes even worse by introducing a third model, transducers, which I think adds little to the argument.

I think the reason they get into modeling languages is that part of their argument hinges on the idea that, given a language that follows the principles they espouse, trace equivalence will be sufficient, and the arguments about the relative expressiveness of LTL and CTL go away. At least, that was how I interpreted the paper. I agree that the discussion of transducers felt like a bit of a digression - it was interesting in a way, but occupied a lot of space without really doing much to drive home their point. The introduction of Prompt-LTL also kind of came out of nowhere.

As far as I remember, bisimulation is a technique to prove trace equivalence, or inclusion, between two systems; it need not imply structural equivalence as they claim.

My understanding is that bisimulation can be used to prove trace equivalence (since bisimilar systems must be trace-equivalent), but does so by requiring a certain amount of structural similarity between the LTSes of the two systems (i.e. establishing equivalence requires knowledge of things not directly observable about either system). The process descriptions of the two systems need not be structurally similar (e.g. one might be a parallel composition, one might be purely sequential), but the transition-systems those processes define will be similar in ways that aren't strictly required for contextual equivalence. I could of course be completely wrong about this, since most of my process calculus experience is with CSP, which uses other forms of process equivalence than bisimulation.
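
To make that concrete, here is a small sketch (my own, and possibly naive; the state numbers and the naive fixpoint check are invented for illustration): the classic pair a.(b + c) and a.b + a.c have identical trace sets, but are not bisimilar because their transition systems branch differently.

import Data.List (nub, sort)

type State = Int
type Label = Char
type LTS   = [(State, Label, State)]

-- a.(b + c):  0 -a-> 1,  1 -b-> 2,  1 -c-> 3
pSys :: LTS
pSys = [(0, 'a', 1), (1, 'b', 2), (1, 'c', 3)]

-- a.b + a.c:  0 -a-> 1,  0 -a-> 2,  1 -b-> 3,  2 -c-> 4
qSys :: LTS
qSys = [(0, 'a', 1), (0, 'a', 2), (1, 'b', 3), (2, 'c', 4)]

succs :: LTS -> State -> Label -> [State]
succs lts s l = [ t | (s', l', t) <- lts, s' == s, l' == l ]

states :: LTS -> [State]
states lts = nub (concat [ [a, b] | (a, _, b) <- lts ])

traces :: LTS -> State -> [[Label]]
traces lts s = [] : [ l : t | (s', l, s'') <- lts, s' == s, t <- traces lts s'' ]

-- naive strong bisimilarity: greatest fixpoint of "each side can match the other's moves"
bisimilar :: LTS -> State -> LTS -> State -> Bool
bisimilar l1 s1 l2 s2 = (s1, s2) `elem` gfp [ (x, y) | x <- states l1, y <- states l2 ]
  where
    ls = nub [ l | (_, l, _) <- l1 ++ l2 ]
    ok rel (x, y) =
         and [ any (\y' -> (x', y') `elem` rel) (succs l2 y l) | l <- ls, x' <- succs l1 x l ]
      && and [ any (\x' -> (x', y') `elem` rel) (succs l1 x l) | l <- ls, y' <- succs l2 y l ]
    gfp rel = let rel' = filter (ok rel) rel
              in if rel' == rel then rel else gfp rel'

main :: IO ()
main = do
  print (sort (nub (traces pSys 0)) == sort (nub (traces qSys 0)))  -- True: same traces
  print (bisimilar pSys 0 qSys 0)                                   -- False: not bisimilar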

Which Traces Though?

Of course it is well-known in the process community that bisimulation is usually too fine to be contextual. As Marco states above, bisimulations are a proof technique to show that two processes are contextually equivalent (relative to some other, coarser notion of equivalence like traces). As a proof technique bisimulations are especially nice because they are closed under reductions. Traces are not closed under reductions. This is what makes traces inconvenient in many situations.

Moreover, and more fundamentally, traces are often also too fine in the sense of distinguishing contextually indistinguishable processes. For example, the processes !x.?x and 0 in an asynchronous CCS-like calculus (with !x.P being input on channel x, and ?x being output on x) are contextually indistinguishable, yet not trace equivalent.

This is an instance of the general problem of finding labelled transitions for a reduction-based calculus such that the two semantics coincide. This is a highly non-trivial problem that's far from being solved.

The meaning of equivalence

As I mentioned in the original post, the authors aren't arguing that bisimulation isn't a useful proof technique. Quoting the paper (page 5):

One could argue that bisimulation equivalence is not only a mathematically elegant concept; it also serves as the basis for useful sound proof techniques for establishing process equivalence, cf. [39]. The argument here, however, is not against bisimulation as a useful mathematical concept; such usefulness ought to be evaluated on its own merits, cf. [31]. Rather, the argument is against viewing bisimulation-based notions of equivalence as reasonable notions of process equivalence.

Nain and Vardi contend that because bisimulation defines a structural similarity rather than a purely observational one, it's the wrong relation to use when defining process (read: behavior) equivalence (as opposed to, say, transition-system equivalence). In a sense, I guess the argument boils down to what you think "process equivalence" should mean (philosophically rather than mathematically).

I'm afraid I'm unable to see how !x.?x and 0 are contextually indistinguishable (I don't have much experience with asynchronous process calculi). I'm guessing that !x and 0 are indistinguishable in the sense that neither causes an attempt to output on x to block (since communication is asynchronous), but I'm not sure why the ?x part of the first process would not have some effect on at least some contexts (for example a context that accepts an input on x) that 0 could not produce. Could you please explain why the two processes cannot be distinguished contextually?

Starting to see the argument

Ah, I am used to thinking in terms of trace inclusion, not CSP bisimulation proofs on process equivalence - which I have little knowledge of. Should I read it as a paper which tries to convince others [used to CSP] to think of process equivalence in terms of trace inclusion like Lynch/Vaandrager?

[I reread your comment above; I guess that is a yes. I just read it with the wrong eyes knowing more of automata theory than process calculi.]

Not including CSP

Not quite. Many process calculi (most prominently CCS and the pi-calculus) use bisimulation as a proof technique to establish an equivalence between the labelled transition systems that provide the underlying semantics of their process model. It's not the same as trace inclusion, and in fact provides a stronger criterion for process "equivalence" (processes may have identical trace sets, but not be bisimilar).

CSP traditionally uses something more like trace inclusion to define process equivalence (although bisimulation is used within the FDR2 model-checker to perform state-space compression). The original CSP denotational semantics was based on traces alone. Later, Brookes, Roscoe, and Hoare added the notion of "refusals" (decorations on a trace that indicate which events/symbols can be refused at each step) to enable them to deal with issues like the one used as a running example in the Nain/Vardi paper. That approach allows for stronger process equivalence criteria (finer distinctions), but at the price of relying on non-observable information about a process (how can an observer know what a process will or will not refuse to do?). More recently, Brookes has been doing some work on incorporating the kind of information presently captured in refusals directly into the structure of traces. That idea seems to be the inspiration for Nain and Vardi.

The Lynch/Vaandrager approach to process equivalence by trace-inclusion is, I think, very close to what Nain and Vardi are promoting. It's not clear to me yet exactly what the differences (if any) actually are.

[Edit: as Toby quite rightly points out below, my statement that refusals are "non-observable" is incorrect. A better way to phrase it might have been to say that from a testing point of view observing refusals requires attempting interactions with a process, but in the current approaches to CSP semantics those attempted interactions aren't (directly) included in the traces -- which is to say the records of observations -- of a tested process.]

Observable Refusals

I would argue that refusals are perfectly observable by a process's environment.

Say you're the environment and I'm the process. You watch me execute to a point where I've performed some trace s. You then offer me a set of events X. If we deadlock, you know that I can refuse X after s -- i.e. that (s,X) is a failure of mine. Otherwise, I might perform an event e from X. I've now performed a trace s ^ <e>, and we then repeat the above process. I might also diverge after performing s, but let's ignore this for now.
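
In code, the experiment might look something like this sketch (my own; it ignores internal moves and divergence, and the transition system and event names are invented):

type State = Int
type Label = Char
type LTS   = [(State, Label, State)]

afterTrace :: LTS -> State -> [Label] -> [State]
afterTrace _   s []       = [s]
afterTrace lts s (e : es) = concat [ afterTrace lts t es
                                   | (s', e', t) <- lts, s' == s, e' == e ]

-- offer the set X after trace s: can the process wind up refusing all of X,
-- i.e. is (s, X) one of its failures?
canRefuse :: LTS -> State -> [Label] -> [Label] -> Bool
canRefuse lts s0 s x = any stuck (afterTrace lts s0 s)
  where stuck st = null [ () | (st', e, _) <- lts, st' == st, e `elem` x ]

-- a -> STOP [] b -> STOP, as a transition system
ext :: LTS
ext = [(0, 'a', 1), (0, 'b', 2)]

main :: IO ()
main = do
  print (canRefuse ext 0 ""  "ab")  -- False: initially it must accept a or b
  print (canRefuse ext 0 "a" "ab")  -- True: after <a> it is STOP and refuses anything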

This is the basis, is it not, behind the stable-failures semantics?

Arguing that only traces are observable seems problematic to me. Can you elaborate?

Observing my failure to articulate

"Observable" may not have been quite the right term to use. To be honest, I'm still grappling with understanding this myself, since the way I'm used to understanding processes is in terms of CSP semantics. The best I can articulate it at this point is that by offering up a set of events X, and then noting a refusal, we are (in a way) interacting with the process, yet not recording the result of that interaction in the trace of interactions. Instead, we're treating those interactions as somehow "special", and including them in a separate decoration that records (in effect) what a process might do next. In contrast, Brookes seems to be arguing for an approach (e.g. here or here) in which refusals are recorded directly in the trace as idling actions. The result appears to be something similar the failure-traces described by van Glabbeek in The Linear Time - Branching Time Spectrum I.

I suppose it's debatable how much practical difference such a change would make. Failures-trace semantics does subsume failures semantics, and Brookes claims that it provides a more general semantic model (i.e. applicable beyond standard rendezvous-communication CSP). Nain and Vardi seem to be proposing a similar idea, although with the additional requirement that processes be receptive. Their main concern is to settle on contextual equivalence as the correct way to define process equivalence. Stable-failures or failures-divergences presumably pass that test. However, the Nain/Vardi paper argues against such "decorated" traces on the grounds that defining the decorations as part of the concurrency theory leads to a proliferation of different equivalences, while they are seeking a single definition of equivalence. Quoting from their paper:

In our approach, only the modelers know what the relevant aspects of behavior are in their applications and only they can decorate traces appropriately, which led to our Principles of Comprehensive Modeling and Observable I/O.

So, to recap, "non-observable" was a poor choice of words on my part. You are quite correct that refusals constitute an "observation" of the process. The issue is really whether certain "observables" should be given special treatment, or should be included directly in the structure of the traces along with all other observations. CSP seems to work quite well with the former approach, while Nain and Vardi seem to be arguing for the latter because they believe it will lead to a more uniform and "right" definition of process equivalence.

Buy Lynch?

"Observable" may not have been quite the right term to use. To be honest, I'm still grappling with understanding this myself, since the way I'm used to understanding processes is in terms of CSP semantics.

I have the inverse problem. IO automata are input-enabled; they cannot refuse input. For that reason, I never really liked CSP-like modelling of processes, and never really dove very deep into them; it feels somewhat strange to model processes which will deadlock on not being able to process input. Although I see it could be natural in a process description: why be able to handle input which shouldn't occur at some point in a protocol?

However, it seems to me that this is actually a specification problem, and dives somewhat into the Software Engineering of protocols. I would argue that it is reasonable to expect input-enabledness from processes since one wants to explicitly define how to handle failures in a protocol, and not leave that underspecified.

Are all these approaches not a stop-gap to remedy an SE problem in CSP?

(Though my opinion on IO automata is that they exhibit the opposite problem to CSP: I feel they are a low-level approach to modelling protocols, and rather should be used as the underlying semantics of a process modelling language. Not the reverse.)

{Snipped a ridiculous comment. The problem is how to handle branching in receptive/non-receptive models where trace equivalence is the preferred manner of defining equivalence on processes.}

I see that contextual equivalence, which I think is understood as behavioural equivalence w.r.t. another formula or system, might be a good notion; i.e. processes should be equivalent w.r.t. well-behaved partner processes. But is that really a problem? I think that is usually trivially modelled in most CTL formulae which try to establish certain qualities of processes. I.e. they often have the form: P => Q, where P is a restriction on the behaviour of the system, and Q is the quality one would like to verify.

I suspect the IO automata model may actually already solve a lot of problems as stated by the authors, but I admit I should really read their paper better. My knowledge of this field is rather "stale" at the moment.

[ I assume there is no fundamental difference between: S |- P => Q, a system S under restriction P implies quality Q, and (S|T) |- Q, a system S when composed with T implies quality Q ]

[ I scanned the van Glabbeek paper; most differences between trace-based and branching-based logics do disappear when assuming input-enabledness - so OK, I finally see the argument. ]

[ Most systems branch, so I would argue that branching time logics are the way to go. *enough editing ;-)* ]

Receptive

I would argue that it is reasonable to expect input-enabledness from processes since one wants to explicitly define how to handle failures in a protocol, and not leave that underspecified.

Indeed. That's precisely what Nain and Vardi are arguing as well. Of course, there's nothing to prevent you from defining processes that are always receptive using CSP. The difference is that CSP also allows you to define non-receptive processes, which can be a really useful abstraction technique. However, it does make certain things implicit that Nain and Vardi are arguing should actually be explicit.

Also, despite the focus on CSP in Nain and Vardi's paper, I think it's important to note that CSP semantics is in fact based on notions of contextual equivalence. The approach is not quite the one advocated by Nain and Vardi, but is similar in many ways.

I suspect the IO automata model may actually already solve a lot of problems as stated by the authors...

I suspect that you may be right if we're talking about modeling languages. But I think that their message was more about the definition of equivalence (regardless of the specific modelling language used), and about the logic most appropriate for reasoning about such equivalences.

Re: Buy Lynch?

Hello Marco!

I would argue that it is reasonable to expect input-enabledness from processes since one wants to explicitly define how to handle failures in a protocol, and not leave that underspecified.

That's like saying "I would argue that it is reasonable to expect all integers to be even"!

Process calculi model many different computational situations, and in some of them input enabledness makes sense, in others it does not. The point of process theory is not really to prescribe how the world has to be but to model it and to find connections and commonalities between different models.

That said, input-enabledness is often the right choice and most calculi like pi or CCS or CSP have coherent input-enabled fragments which one can extract via types.

Enabledness

Is it not also a question of what semantics to prescribe to CCS specifications? I.e. input-enabledness for an IO automaton implies that it will always eat the input; that does not necessarily imply that it will perform any interesting work (it may stay in the same state).

In that sense, underspecification for an IO automaton means that it will just ignore the input; this is different from not accepting the input.
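
A tiny sketch of that distinction (my own; the state and input names are invented): an input-enabled transition function is total, so unexpected input is consumed and simply ignored, whereas a CSP-style process may have no transition at all for that event.

data St = Idle | Busy deriving (Eq, Show)
data In = Start | Done | Ping deriving (Eq, Show)

-- input-enabled (I/O-automaton style): defined for every input in every state
ioStep :: St -> In -> St
ioStep Idle Start = Busy
ioStep Busy Done  = Idle
ioStep s    _     = s          -- unexpected input is eaten; nothing happens

-- non-receptive (CSP style): Nothing means the event is refused, so the sender would block
cspStep :: St -> In -> Maybe St
cspStep Idle Start = Just Busy
cspStep Busy Done  = Just Idle
cspStep _    _     = Nothing

main :: IO ()
main = print (ioStep Idle Ping, cspStep Idle Ping)  -- (Idle,Nothing)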

From an SE point of view, I think "ignore" might be preferred to a "halt" semantics. Actually, that is always how I understood the difference between IO automata and CCS/CSP models. IO automata roughly take the approach "Let's choose the model such that we can verify most industrial examples," while CCS/CSP models take the approach "Let's find the academic model with the nicest properties (in the hope that at some point we will end up with the best model)."

To some extent, I guess the paper in the root post might be read as a "convergence" paper; at some point these two wishes should meet.

What is your interpretation [of the paper]?

[It's been years since I read Nancy Lynch's book, though.]

Different choices (for models and modelers)

The section of Lynch's book (I'm assuming we're talking about "Distributed Algorithms" here) that introduces I/O automata explicitly acknowledges that some people may consider input-enabledness too strong a restriction for a general model. However, she goes on to point out that there are other ways to model systems that expect certain inputs only at certain times, and asserts that there are two major advantages of having the input-enabled property:

First, a serious source of errors in the development of system components is the failure to specify what the component does in the face of unexpected inputs. Using a model that requires consideration of arbitrary inputs is helpful in eliminating such errors. And second, use of input-enabling makes the basic theory of the model work out nicely; in particular, input-enabling makes it reasonable to use simple notions of external behavior for an automaton, based on sequences of external actions.[p. 203 of Distributed Algorithms]

There's no suggestion that I/O automata were specifically developed "to verify most industrial examples" (Lynch's book is, after all, quite theoretical in nature).

The impression I get from reading the early papers by Hoare, Brookes, and Roscoe, is that the ability to refuse inputs was included because it made it easier to model synchronization as communication (by providing a more abstract view of handshaking), while at the same time allowing input-enabled processes to be defined if desired (the choice being left to the modeler). I'd assume that the rationale for CCS was similar (although I haven't read Milner's early papers on CCS). Basically, they made a choice to operate at a different level of abstraction than I/O automata, with corresponding differences in their semantic models. Given how widely used CSP has been in industry over the years (possibly more so than I/O automata, although I have no hard numbers to back that up), I don't think it's reasonable to claim that the design of CSP makes it somehow less useful than I/O automata for verifying industrial examples.

Interestingly, although CSP theory is based on a notion of silent refusal to engage in certain input events, the little Lisp-based "implementation" of CSP (really more an operational model meant for experimenting with processes) included in Hoare's CSP book actually adopts an approach a bit more like the one advocated by Brookes and Nain/Vardi:

If the symbol is not a possible first event for the process, the function gives as its result a special symbol "BLEEP, which is used only for this purpose. For example, since STOP never engages in any event, this is the only result it can ever give, and so it is defined

STOP = λ x • "BLEEP

[p. 17 of the PDF version of Communicating Sequential Processes]

The prefixing, choice, and other operators are similarly defined to produce "BLEEP if an attempt is made to engage in an event outside the set that they are prepared to accept. However, production of "BLEEP doesn't actually allow an output to be consumed, i.e. a process attempting to output a symbol a will be blocked if the intended receiver only produces "BLEEP in response to a events, and processes cannot themselves observe "BLEEP (only the person using the model can do that). The result is that the model is always receptive to user input, but might produce an audible bleep if the user provides an input that the model isn't ready to do anything with.
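
For what it's worth, here is a rough Haskell transcription of that model (my own reading of the book's Lisp version, so the operator names and the run function are invented):

data Result e = Bleep | Next (Process e)
newtype Process e = Process { offer :: e -> Result e }

stop :: Process e
stop = Process (const Bleep)                    -- STOP = \x -> "BLEEP

prefix :: Eq e => e -> Process e -> Process e
prefix a p = Process (\x -> if x == a then Next p else Bleep)

choice :: Process e -> Process e -> Process e   -- try the left branch, then the right
choice p q = Process (\x -> case offer p x of
                              Next p' -> Next p'
                              Bleep   -> offer q x)

-- drive a process with user input, one symbol at a time; a refused symbol
-- produces a bleep (False) but the process stays put and accepts further input
run :: Process e -> [e] -> [Bool]
run _ []       = []
run p (e : es) = case offer p e of
                   Next p' -> True  : run p' es
                   Bleep   -> False : run p  es

main :: IO ()
main = print (run (choice (prefix 'a' stop) (prefix 'b' stop)) "cab")
-- [False,True,False]: 'c' bleeps, 'a' is accepted, then 'b' bleeps against STOP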

IO automata vs CSP

There's no suggestion that I/O automata were specifically developed "to verify most industrial examples" (Lynch's book is, after all, quite theoretical in nature).

Great post. I am very sure that CSP/CCS is used more than the IO automata model for industrial verifications. The IO automata model never became that popular.

The IO automata model was developed after CSP. The advantages of IO automata over CSP are input-enabledness and fair-trace semantics; it's just a very elaborate model which tried to solve some 'problems' other models couldn't easily solve. I guess in IO automata, the emphasis is on behaviour and not on message passing semantics.

As far as the "industrial relevance" goes, I guess you're right and I read too much into it.

Uh, so what?

Not quite. Many process calculi (most prominently CCS and the pi-calculus) use bisimulation as a proof technique to establish an equivalence between the labelled transition systems that provide the underlying semantics of their process model. It's not the same as trace inclusion, and in fact provides a stronger criterion for process "equivalence" (processes may have identical trace sets, but not be bisimilar).

Yeah, but what does bisimilarity mean? I think of bisimilarity solely (mostly) in terms of observational equivalence, not in terms of being able to partition states/terms into equivalence classes.

I accept the argument that for specific traces observed, it may be hard, or impossible, to give a bisimulation proof.

On the other hand, I don't see what other proof could be given for trace equivalence; I think that the bisimulation principle usually can be understood to be equivalent to being able to give an extensionality proof on traces (at all time points), and I wouldn't be surprised if these proof strategies mutually imply each other (the same way that Noetherianness is equivalent to well-foundedness and implies that an inductive argument can be given).

And then, even if bisimulation as a foundation implies a less expressive logic than a logic based on trace equivalence (now I am grappling with that difference), I suspect that is probably irrelevant to real-world examples. The structure of the formulae and systems probably implies that bisimulation (or refinement) proofs are just the only natural way to handle specifications over process descriptions.

[ hmpf, one also uses bisimulation to prove trace equivalence; this is different from branching-time logics, hence the confusion. Bisimulation != branching ]

Re: The meaning of equivalence

Hello Allan,

I'm afraid I'm unable to see how !x.?x and 0 are contextually indistinguishable (I don't have much experience with asynchronous process calculi). I'm guessing that !x and 0 are indistinguishable in the sense that neither causes an attempt to output on x to block (since communication is asynchronous), but I'm not sure why the ?x part of the first process would not have some effect on at least some contexts (for example a context that accepts an input on x) that 0 could not produce. Could you please explain why the two processes cannot be distinguished contextually?

Your guess is right: in asynchronous CCS (or similar calculi) there is no context that can tell the difference between 0 and !x.?x. Thus the usual notion of trace here is too fine-grained. What this really means is that the conventional labelled transition systems are not quite right. Finding the right ones can be non-trivial. I think it took about a decade between the invention of the async. pi-calculus and M. Merro's discovery of a labelled transition system for this calculus that matches the reduction congruence precisely.
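
Here is a toy illustration (my own, and for one specific context only, so it demonstrates the intuition rather than proving anything). A configuration is a bag of pending asynchronous outputs plus a set of one-shot receivers; the context sends on x and signals on o if it later manages to receive on x itself. The reachable observables are the same whether the tested process is 0 or the forwarder, because a consumed message is always re-emitted.

import Data.List (nub, sort, delete)

type Chan     = String
type Receiver = (Chan, [Chan])          -- (channel awaited, outputs emitted afterwards)
type Config   = ([Chan], [Receiver])    -- (pending outputs, waiting receivers)

step :: Config -> [Config]
step (outs, recvs) =
  [ (sort (delete c outs ++ emit), delete r recvs)
  | r@(c, emit) <- recvs, c `elem` outs ]

reachable :: Config -> [Config]
reachable c0 = go [c0] []
  where
    go []       seen = seen
    go (x : xs) seen
      | x `elem` seen = go xs seen
      | otherwise     = go (step x ++ xs) (x : seen)

-- every output channel observable now or after some reductions
weakBarbs :: Config -> [Chan]
weakBarbs c = sort (nub (concat [ outs | (outs, _) <- reachable c ]))

forwarder :: Receiver
forwarder = ("x", ["x"])                -- !x.?x : receive on x, then resend on x

context :: [Receiver]
context = [("x", ["o"])]                -- the observer: receive on x, then signal on o

main :: IO ()
main = do
  let withNil = (["x"], context)              -- C[0]
      withFwd = (["x"], forwarder : context)  -- C[!x.?x]
  print (weakBarbs withNil)                   -- ["o","x"]
  print (weakBarbs withFwd)                   -- ["o","x"]: same observables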

(Of course at this point one may ask: why reduction congruence etc)

Another paper about the topic

A few years ago, I published a paper at the CPA conference (the conference of the CSP community, still alive!) about linear-time vs. branching-time semantics. The idea was to decorate traces with locations to encode the branching structure of process behaviors. There are some quirks in the paper (the mobility features of the pi-calculus are not modelled in a totally satisfying way, to be frank), but a PhD student is working with me on a much improved version of the model (cf. this technical report).

The paper On Linear Time and Congruence in Channel-passing Calculi is available online.

Response by Robin Milner

A response by Robin Milner to "Branching Time vs. Linear Time: Semantical Perspective" is posted at http://www.cs.rice.edu/~vardi/papers.

Moshe

Direct link

Updated paper

Thanks for the update, Moshe! It's interesting to see what Milner's thoughts on your paper were. I also see that there's now an updated version of the paper, which looks like it may have benefited from some of the discussions here. Nice to see LtU having an impact :)

As an aside, regarding your footnote on page 6 of the new version: I'd like to note that although Hoare's 1985 CSP text included "BLEEP in his operational model, none of the more recent descriptions of operational semantics for CSP (e.g. in Roscoe's Theory and Practice of Concurrency) appear to incorporate "BLEEP or anything like it.

Our follow-up paper

We've recently extended the technical results of this paper (showing full abstraction of trace semantics for synchronous transducers) to probabilistic and asynchronous transducers.

For details see: Trace semantics is fully abstract (LICS'09)

Sumit