What is computation? Concurrency versus Turing's Model

Concurrency is of crucial importance to the science and engineering of computation in part because of the rise of the Internet and many-core architectures. However, concurrency extends computation beyond the conceptual framework of Church, Gandy [1980], Gödel, Herbrand, Kleene [1987], Post, Rosser, Sieg [2008], Turing, etc. because there are effective computations that cannot be performed by Turing Machines.

In the Actor model [Hewitt, Bishop and Steiger 1973; Hewitt 2010], computation is conceived as distributed in space where computational devices communicate asynchronously and the entire computation is not in any well-defined state. (An Actor can have information about other Actors that it has received in a message about what it was like when the message was sent.) Turing's Model is a special case of the Actor Model.

A Turing Machine can enumerate all the possible executions of a closed system even though it cannot perform them individually. For example, the integers are recursively enumerable even though a non-deterministic Turing Machine has bounded non-determinism (i.e. there is a bound on the size of integer that can be computed starting on a blank tape by an always-halting machine). Proving that a server will actually provide service to its clients depends on unbounded non-determinism. That's why the semantics of CSP were reversed from bounded non-determinism [Hoare CSP 1978] to unbounded non-determinism [Hoare CSP 1985].

See the article at: What is computation? Concurrency versus Turing's Model


Probability measures

It looks to me like Turing machines with bounded non-determinism are a reasonable model of reality, so long as you assign independent probabilities to the execution branches. In reality, networking processes that rely on unbounded non-determinism for their correctness actually allow the possibility of non-termination - it's just on a set of probability measure zero. The actual probabilities aren't important - only that each branch has probabilities bounded away from 0 and 1.
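The measure-zero claim is easy to see concretely. In a hypothetical sketch (not from the original post), suppose each send attempt fails independently with probability p < 1; then the probability that the first n attempts all fail is p**n, which vanishes as n grows:

```python
# Hypothetical sketch: each send attempt fails independently with
# probability p < 1.  The probability that the first n attempts ALL
# fail is p**n, which tends to 0 -- so "retry forever without success"
# is an event of probability zero, even though it remains possible.
def prob_all_fail(p, n):
    """Probability that the first n send attempts all fail."""
    return p ** n

for n in (10, 100, 1000):
    print(n, prob_all_fail(0.99, n))
```

The actual value of p does not matter, only that it is bounded away from 1, which is the point made above.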

BTW, your link is messed up.

If you assign probabilities to non-deterministic transitions

You end up with something completely different. Not very noticeable at first, but very different altogether.

Just imagine an unfair coin which determines a 0 or 1, and derive bit strings for that. Not only did you model something completely different (000 might have a low chance of occurring) but the language described is different in its limit.

Often, it just doesn't make sense. Imagine a plant controller which monitors a valve which is open or closed; one would normally model that as a nondeterministic process. Since the valve is supposedly controlled by a third party, it may be impossible or just downright counter-intuitive to assign probabilities to that. In a non-deterministic process, one day it may open/close 1:10000, one day it may open/close 10000:1, one day it may decide to recite the binary encoding of the St Matthew Passion.

Another example: ONE-SAT. The set of boolean satisfiability formulas which have a unique solution. Why, and how, would you like to assign probabilities and where will it give you any extra insight into the formula? [ This is trivially solved by a non-deterministic machine; for a probabilistic machine, not so, or -I gather- with low probability. ]

You can't just replace non-determinism with probabilism. They're just different beasts.

[ It becomes even worse when you have mixed non-deterministic and probabilistic algorithms. Just replacing the non-determinism in those with a stochastic interpretation means you, in general, messed up the model beyond recognition. ]

Probability vs. probability zero

I was a little terse, but my point in the last sentence was that actual probabilities (so long as they stay epsilon away from 0/1) don't matter for determining the events of probability zero. So you don't actually need to pick probabilities to study the sets of probability zero. If you actually need probabilistic branching, I agree that's a different beast.

Why is this relevant? (This is partially a response to your other post elsewhere in the thread). This is how guaranteed delivery of messages works in the real world: we attempt to send a message, and if that send fails we retry. It is possible in the real world that the send will never go through - each send attempt will randomly fail, forever. That possibility just has probability zero. Carl Hewitt is arguing that a non-deterministic Turing Machine which can run arbitrarily long can also run forever, which is true. But that's the reality of real world concurrent systems.

Buechi Automata

Ah, now I get it. It's the infinite behavior and fairness debate again. But this has been recognized for years, right? I.e., one often uses Buechi automata to model concurrent systems, rather than Turing machines, since Turing machines fall short at modelling infinite behavior.

Ah well, looks like he's banging a pretty old drum there.

Proving that a server provides service

The issue is *proving* that a server provides service to its clients. That's why the semantics of CSP were reversed from bounded non-determinism [Hoare CSP 1978] to unbounded non-determinism [Hoare CSP 1985].

Fairness

Well, you can't prove that [a service is provided] without a notion of fairness. The 'a Turing machine must halt' assumption is essentially a fairness condition, and from the example on the posted page, producing an unbounded integer, it seems that they modeled fairness into that model as well. (Was that CSP? I don't know CSP that well.)

I.e., [it looks to me that] their notion of fairness is pretty close to normal fairness: if a transition is enabled infinitely often, it is taken infinitely often. From the example I gather they limit themselves to finite traces. But I wonder whether their notion of fairness is just 'one doesn't consider infinite traces where an enabled transition is never taken'. (Which is pretty close to normal strong fairness, which, stated equivalently, says that from anywhere in an infinite run an enabled transition will always be taken.)

[Rewrote some incoherent musings.]

Real world of concurrent systems...

Regarding this 'reality of real world concurrent systems' you mention:

how guaranteed delivery of messages works in the real world: we attempt to send a message, and if that send fails we retry. It is possible in the real world that the send will never go through - each send attempt will randomly fail, forever. That possibility just has probability zero.

In the real world of concurrent systems, there is such a thing as permanent node failure (with non-zero probability) and this is difficult to distinguish from temporary node failure. This is one of those problematic issues that tend to piss all over the actors model semantics, even before we deal with network partitioning. I found actors model to be nice and elegant until I had to start dealing with partial failure, partitioning, recovery, and related concerns.

Code and implementation quickly grow ugly when individual 'actors' must explicitly concern themselves with timeouts or message-loss issues. With partitioning, it is unclear whether the message failed to deliver, vs. the acknowledgement failed.

But, I'll grant, this is rather weakly related to your point on probabilities.

For probabilities, I would say only that providing your 'epsilon' can be quite difficult. If estimated probabilities degrade as function of the number of attempts, then the estimated probability for one success in an infinite stream of attempts may lie somewhere between one and zero. It can be difficult to reason about, say, probability of progress for a large transaction; conflicts may easily beget conflicts.
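To illustrate that last point with hypothetical numbers: if the success probability of the n-th attempt degrades as 1/n^2, the probability of never succeeding is the product of (1 - 1/n^2) over n >= 2, which telescopes to (N + 1) / (2 * N) after N attempts and converges to 1/2, strictly between zero and one:

```python
# Hypothetical illustration: attempt n succeeds with probability
# 1/n**2 (n >= 2).  The probability of NEVER succeeding is the product
# of (1 - 1/n**2), which telescopes to (N + 1) / (2 * N) after N
# attempts and converges to 1/2 -- strictly between 0 and 1, so
# eternal failure is no longer a probability-zero event.
def prob_never_succeed(N):
    p = 1.0
    for n in range(2, N + 1):
        p *= 1.0 - 1.0 / (n * n)
    return p

for N in (10, 100, 10_000):
    print(N, prob_never_succeed(N))   # approaches 0.5 from above
```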

Real world many-core concurrent systems

In many-core concurrent systems messages are always delivered and partitioning never occurs.

How long can we have that guarantee?

It seems like we're rapidly approaching 6nm dies and the manufacturers of many-core CPUs are trying to come up with prototypes to figure out what 1,000,000 core CPUs will look like. I've heard some predict we'll have to deal with partitioning. The reason being that engineers won't be able to figure out how to power the entire chip. As far as I know, nobody has solved this hypothesized future power problem.

It seems you are saying that guarantee will hold forever, e.g. either the chip manufacturers will solve the problem, or compiler writers and system engineers will solve it at the software level somehow such that it will be a non-issue. I can't see the latter happening, due to the global requirements that this seems to impose.

Moreover, in the cloud, we don't have that guarantee; the speed of light is your bottleneck, and a round-trip between opposite ends of the world suffers extreme propagation delay, even with protocols designed to minimize it.

Actor model will get better.

The Actor model is very close to the laws of physics. So scaling should increase its applicability by comparison with other models that make artificial assumptions.

However, inconsistency robust logic will be needed. See the following:
Common sense for concurrency and inconsistency robustness using Direct Logic(TM) and the Actor Model

Get better - how?

If the "Better" is inconsistency robust logic, then how do you see people designing systems with that feature? (I have some ideas, but you haven't stated yours with real-world applications...)

Would you mind if we continue this via private discussion? http://carlhewitt.info doesn't actually seem to point to anything meaningful, so I don't know your present contact info.

I'm not sure I buy the argument "scaling should increase [the Actor Model's] applicability [due to similarity with laws of physics]." Herbert Simon once argued for procedural rationality in place of substantive rationality in economics, because substantive rationality could not actually explain most microeconomic activity. He had convincing examples. You need something similarly convincing. So far your tutorials aren't that great. While Direct Logic is interesting, you appear to be missing the killer app for your ideas. I have yet to see what you are doing with this in the cloud.

My major motivator for bringing up Simon is that I feel engineers intuitively know some problems are hard, and rather than solving them directly through standard classical math and standard models of computation, they use human processes to solve these problems. I feel this is the spirit of Simon's idea about economic agents, and that the truth is we have a broader continuum of economic agents to think about when designing 24x7x365xInfinity systems. I also think most real-world software systems have lots of hidden information that needs to be accounted for that most languages don't account for. (I see some signs of languages that are moving in this direction, some in OOP and others in term rewriting.)

In many-core concurrent

In many-core concurrent systems (the really large-scale ones, seen in clouds and super-computing systems), cores fail with statistical regularity - even before we note related issues of power or HVAC failure.

I do agree it is simpler to have the implementation maintain the redundancy and avoid partitioning, in these cases (relative to the general distributed case), but one is still fighting with the model to achieve those properties.

The prevalent indeterminism in the actors model makes it rather poor for maintaining a redundant computation or recovering after failure. But this issue, at least, is easily resolved: one can always provide a deterministic implementation of a non-deterministic computation model, or use 'islands' of determinism (with each actor living in an island) akin to Croquet's islands or E's vats.

Actors model is also highly stateful: messages in transit introduce stateful queues and carry snapshots of prior states, actor state is necessary for coordination of multiple services or multiple input sources, explicit caching is needed for expensive queries or fusion of callback or observer-pattern data, and new actors have new handles and introduce state in routing services. All this state is inefficient to make redundant or resilient under partial failure conditions. How do we track which messages have been delivered, which will need to be recomputed? How do we protect against the case where a message sent contains a handle to a 'new' actor that is lost due to node failure shortly after its creation?

We need models to be resilient before they'll truly support scalable real-world concurrency. The issues of partial-failure aren't as important in-the-small, but we cannot ignore them in-the-large.

Croquet islands and E vats are awkward

Croquet islands or E vats are awkward by comparison with the "Swiss cheese" model in ActorScript. See the following:
ActorScript(TM) extension of C#(TM), Java(TM), and Objective C(TM)

In many-core concurrent

In many-core concurrent systems (the really large-scale ones, seen in clouds and super-computing systems), cores fail with statistical regularity - even before we note related issues of power or HVAC failure.

Do you have any references to this for same-socket (i.e., non-cluster) machines, or perhaps did you not really intend to write many-core? E.g., if you have a bunch of GPUs slung together (and pretend they're on one socket), I don't think you're worried about any one node failing.

The current emerging model for node failure in the cloud setting (dynamic requests for jobs consuming many HW resources) seems to be significantly reducing the burden for most software: post-fab screening means any initially broken cores are already off and 'smart' data centers alert hardware maintainers of new problems and prevent rescheduling on the same broken node. One job might go down, but you (or your runtime or your service user) just restart it elsewhere. We're now seeing a lot of discussions of performance failures -- stragglers -- but the initial stir about them was misleading. Apparently the cloud community has been doing a surprisingly bad job at their runtime systems: symptoms like stragglers mostly exist because of bad scheduling (i.e., work partitioning), not hardware failure.

Component failures are

Component failures are rarely independent, I agree. For 'large scale' I mean 'many cores per chip * many chips per box (or rack) * many boxes/racks per centre * many centres', which adds up to a lot of cores. Whether I'm right to call this 'many-core' is, I guess, an issue of jargon that I've not bothered learning. I've only discussed the subject with another who works in a supercomputing centre, not done any research on the subject myself.

In one discussion, regarding a group that was performing operations on high-resolution satellite imagery, they described node failures as one of the bigger problems they needed to resolve - typically, one or two nodes failed per experiment. Since the computations are pure, such a task is relatively straightforward to restart, but it wasn't something the developers had initially anticipated. I've also read adverts for supercomputing elements that proudly describe hot-swappable CPUs - I couldn't help but imagine that 'hot' was probably no joke.

Whether you can 'just restart' a task really depends on the nature of the task - how much communication and state is involved with clients and third-party services, for example. I've never understood how people casually accept losing work, leaving interactions in an ill-defined state, often requiring administrator or human intervention. Rather than restarting, we should continue the task as though it were never interrupted, or lose at most some well-defined amount of work.

I agree on the scheduling issue. I've not done enough to discuss scheduling on super-computer scales, but even at the small scale, on the desktop we could do a LOT better at controlling how far some tasks straggle behind others. That's something I'm working on at this moment (developing a scheduler for a paradigm with temporal semantics).

For some values of always

For some values of always and never.

Assigning probabilities doesn't help

Assigning probabilities to non-deterministic transitions doesn't help because:

* Performing billions of operations per second means that even very low probabilities are intolerable.
* Underlying correlations can raise probabilities to unacceptable levels.

Broken Link

The hfref tag doesn't work for me.

Re: broken link

If you change <a hfref="…"> to <a href="…">, then the link becomes http://knol.google.com/k/carl-hewitt/what-is-computation/pcxtp4rx7g1t/30#.

Great Voodoo

I have no idea what the guy on that link is talking about.

There is a bound on the size of integer that can be computed by an always-halting nondeterministic Turing Machine starting on a blank tape.

Why? The trick is in the always-halting assumption? [ Ah, I see. ]

Always halts

Because the tape starts blank, the NTM is just an NFA with an infinite scratch tape at its disposal. But there must be some fixed upper bound on how much of the tape it will use, because if there were no such upper bound, the machine would not always halt. So it's an NFA with a *finite* scratch tape, which is equivalent to another NFA with no scratch tape and a larger number of states. [ Like you said :-). ]
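The boundedness argument can be made concrete with a toy machine (hypothetical, for illustration only): since every branch halts within some depth, exhaustively enumerating all choice sequences of that depth covers the entire execution tree, which is finite by Koenig's lemma, and so the largest output is bounded:

```python
from itertools import product

# Toy always-halting nondeterministic machine (hypothetical): at each
# binary choice it either halts (0) or increments its count (1), and
# every branch is guaranteed to halt within DEPTH steps.  Because the
# execution tree is finitely branching and every path is finite, the
# whole tree is finite (Koenig's lemma), so the outputs are bounded.
def run(choices):
    count = 0
    for c in choices:
        if c == 0:
            return count        # the 'halt' branch
        count += 1
    return count                # choices exhausted: halts anyway

DEPTH = 5
outputs = {run(cs) for cs in product((0, 1), repeat=DEPTH)}
print(sorted(outputs))          # [0, 1, 2, 3, 4, 5]: bounded by DEPTH
```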

Closed Actor system is not necessarily a NFA

A closed Actor system (i.e. has no input) is not necessarily a nondeterministic finite-state automaton (NFA) even though it always halts.

Closed Actors System?

I would think it trivial to create a closed actor system, with one message in the closed system's initial state, that never halts - simply passes messages forever in a cycle.

Are you presenting the rather trivial argument that a program that never starts, starts halted, and thus will always halt? Or do you mean something else? (a particular application?)

NFA cannot implement closed Actor systems

That Nondeterministic Finite State Automata (NFA) cannot implement closed Actor systems follows from the fact that Nondeterministic Turing Machines cannot implement closed Actor systems.

It is sufficient to prove that there is a closed Actor system that cannot be implemented by a Nondeterministic Turing Machine.

Thank you for drawing me into

Thanks for drawing me into the Actors side of this. Scrutinizing the Actors half of the paper's integer-bounding argument, I think the paper is drastically oversensationalizing the relationship between TMs and Actor systems.

For a suitable formalization of "always halts", a given NTM starting with blank tape that "always halts" can only produce integers up to some finite upper bound. Check. But the only way to get the claimed result about a closed Actor system that "always halts" producing arbitrarily large integers is to change the formal definition of "always halts". That's a genuinely interesting theoretical result, but IMHO it's nothing remotely like the 'Turing Machines aren't powerful enough' result that the paper seems to claim.

Given a closed actor system, I can construct an NTM that will simulate it, such that the simulating NTM will halt when the simulated Actor system halts. It will be a preposterously inefficient NTM, but this is about computability, not complexity. According to the paper, the reason the closed Actor system can generate an arbitrarily large integer is that an actor can asynchronously send itself a stop message, and then keep incrementing until it receives that message; but that just means that there is no upper bound on how many increments there may be time for before the stop message arrives. I can simulate that just fine with my NTM: I'll record on the tape the fact that the stop message was sent, and then at each step of the simulation I'll nondeterministically decide whether or not the stop message in transit actually arrives. This has, of course, violated the sense of "always halts" that was used to bound the size of integer that could be produced by an NTM. What it demonstrates is that explicit nondeterminism of decisions in an NTM can simulate the implicit nondeterminism of asynchronicity in an Actor system.
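That simulation step can be sketched as follows (a hypothetical encoding, one execution branch per deferral count): the NTM nondeterministically decides at each step whether the in-transit stop message arrives now or later, so each halting branch yields one finite count, while the branch that defers forever never halts:

```python
# Hypothetical sketch of the simulation above: each execution branch of
# the NTM corresponds to deferring the in-transit stop message some
# number of times before delivering it.  Every finite integer appears
# on some halting branch; only the branch that defers forever fails to
# halt -- precisely the "always halts" condition being traded away.
def simulate(deliveries_deferred):
    """One branch: the stop message arrives after this many deferrals."""
    count = 0
    while count < deliveries_deferred:
        count += 1              # 'not yet' was chosen; keep counting
    return count                # stop arrives; the actor replies with count

print([simulate(n) for n in (0, 1, 10, 1000)])   # [0, 1, 10, 1000]
```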

That's an interesting insight into the nature of asynchronicity. The paper makes it sound like some sort of refutation of the Church-Turing Thesis.

Nondeterministic Turing Machines cannot implement Actors

Nondeterministic Turing Machines (NTM) cannot implement Actor systems.

A Turing Machine can enumerate all the possible executions of a closed Actor system even though a non-deterministic Turing Machine cannot perform them individually. For example, the integers are recursively enumerable even though a non-deterministic Turing Machine has bounded non-determinism (i.e. there is a bound on the size of integer that can be computed starting on a blank tape by an always-halting machine).

Apples and oranges

Nondeterministic Turing Machines (NTM) cannot implement Actor systems.

As I explained in my previous post, this is only true if you put a stronger termination condition on the NTMs than you do on the Actor systems. To compare the two on a level playing field, you would have to either strengthen the constraint on the Actor systems, or weaken the constraint on the NTMs. Here are possible ways to do each:

• In each Actor system, for each type of message, specify a fixed upper bound and fixed (non-zero) lower bound on how much time will elapse from sending to receiving the message. It is then no longer possible for the (closed) Actor system to generate arbitrarily large integers.

or

• Rather than requiring that there be no infinite execution path for the NTM, require that for every execution path there exists a suffix that would result in halting. It is then possible for the NTM to generate arbitrarily large integers.

Too weak

• Rather than requiring that there be no infinite execution path for the NTM, require that for every execution path there exists a suffix that would result in halting. It is then possible for the NTM to generate arbitrarily large integers.

This is a little weaker than the condition I suggested of terminating with non-zero probability, since you can have a program that performs two non-deterministic branches and terminates if they both take the left branches, or else performs three branches and terminates if they all take the left branches, or else performs four branches and terminates if they all take the left branches, else ...

In a 50/50 probability model this program will only terminate 1/4 + 1/8 + 1/16 + ... = 1/2 of the time, and yet satisfies your tail condition. I think "terminates with probability 1" is the condition you want on NTMs.
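A quick Monte Carlo run supports this. Below is one hypothetical reading of the program, with independent fair coins each round: flip k coins in round k = 2, 3, 4, ... and halt iff all come up left. The exact halting probability depends on how the rounds are wired together, but the key point survives: the program halts with probability strictly between 0 and 1, even though every finite choice prefix extends to a halting run:

```python
import random

# Hypothetical reading of the program above: in round k = 2, 3, 4, ...
# flip k fair coins and halt iff all of them come up "left"; otherwise
# continue to round k + 1.  Every finite prefix of choices extends to a
# halting run (just make the next round all-left), yet the program
# fails to halt with positive probability.
def halts(rng, max_rounds=40):
    for k in range(2, max_rounds + 2):
        if all(rng.random() < 0.5 for _ in range(k)):
            return True         # round k was all-left: halt
    return False                # stands in for the non-halting tail

rng = random.Random(0)
trials = 50_000
freq = sum(halts(rng) for _ in range(trials)) / trials
print(freq)                     # strictly between 0 and 1
```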

Possibility versus probability

Hm. Well, your constraint does imply mine: If some finite sequence of choices guarantees subsequent nontermination, then the probability of nontermination is bounded away from zero. And there are, as you point out, actual machines that satisfy my constraint but not yours. So the set of machines satisfying my constraint properly contains the set of machines satisfying yours.

But do these two sets differ in computational power? Yours can already generate arbitrarily large integers. It's not immediately obvious to me that, for any of my machines, one couldn't construct one of yours.

My argument isn't mathematical

We can certainly choose whatever definitions we want, but why are we interested in requiring termination at all? I'm just arguing that termination with probability 1 better fits the spirit of the requirement than the tail constraint.

We want to prove that a server provides service to its clients

The issue is that we want to prove that a server provides service to its clients.

If you allow NTMs which

If you allow NTMs which non-terminate with probability 0 then you can implement the message passing you have in mind, and prove whatever theorems you have in mind that hold for actors. Furthermore, this loosening of the termination condition is realistic.

Concurrency versus nondeterministic choice

Indeterminacy in concurrency does not result from making the kind of non-deterministic choices in Nondeterministic Turing Machines. Concurrent computation is more powerful than computation that results from such choices.

We are dealing with mathematical theorems.

Both the Actor model and Nondeterministic Turing Machines (NTM) are precise mathematical models that can be used to prove theorems.

It is a theorem that Nondeterministic Turing Machines (NTM) cannot implement Actor systems.

If you switch definitions around then you will get different models that may or may not be useful.

It's fairly commonplace for

It's fairly commonplace for theorists working with NTMs to explore various termination criteria. (Prob'ly the most baroque variant I've seen is Alternating Turing Machines.) In this case, there's a specific reason to be especially interested in varying termination criteria, namely that the theorem pivots on those criteria. The power distinction drawn by the theorem can be collapsed by straightforward variations of criteria in either model (Actor system or NTM). By projecting either way through the correspondence between these variations, the power distinction itself can be understood in terms of contrasting variations of just one model or of just the other.

This is what mathematicians do. First we discussed exactly what your premises were and why they led to your conclusion, and then we explored the ramifications of your theorem by studying what changes in premises cause it to break.

Fixed broken link

I fixed the broken link. Sorry for the error :-(

No nondeterministic choice in concurrency means no fairness issue

Because there is no non-deterministic choice in concurrency, the issue of fairness of such choices does not arise. Concurrency cannot be modeled as a non-deterministic state machine.

Uh? Concurrency cannot be modeled as a non-deterministic machine

Concurrency cannot be modeled as a non-deterministic state machine.

What? In general, or in the case of the Actor Model? I am pretty sure most people in concurrency would strongly disagree with this. What about Buechi automata, or Nancy Lynch et al.'s MIT IOA model?

Even the examples in the post read (best) as the parallel composition of non-deterministic machines.

[ I grant you that in concurrency one elaborates the meaning of non-deterministic automata to meaning over infinite runs, but that is pretty standard. Not taking into account the standard models for concurrency leads nowhere. Why bother with classical assumptions when it is known for years that standard automata just fall short in modelling concurrency? ]

Choice vs. Concurrency

Actors model, IIRC, has some sort of fairness condition on message delivery - that a message cannot be delayed indefinitely by the processing of other messages. If we were to model actors on a non-deterministic machine, we could do so by making the choice in question be the order for message processing. By doing so, would we not see a direct correspondence in the fairness conditions for ensuring progress?

But I think your statement is simply wrong for another reason: concurrency isn't about choice; it's about simultaneity - of the semantic variety (with parallelism being simultaneity of the operational variety). Indeterminism and non-determinism are orthogonal to the issues of concurrency.

Concurrency is more purely expressed with temporal semantics, which can clearly model which ongoing behaviors are simultaneous, which overlap in time, et cetera. This sort of concurrency is available with temporal concurrent constraint programming, temporal logic, temporal databases, functional reactive programming, and reactive demand programming. Single-assignment variables provide yet another model for deterministic concurrency. These models illustrate that concurrency isn't about choice.

Actors model forces a marriage between concurrency and choice. This leads to a great deal of accidental difficulty when it comes time to coordinate simultaneous workflows, behaviors, services. The difficulty exhibits itself as explicitly stateful queuing and wait patterns in addition to the fairness condition for eventual delivery of messages.

If the choices of Actors model cannot be modeled on a non-deterministic finite state machine, it is only because of the 'finite state' issue, not because of the 'non-determinism' issue.

Simultaneity doesn't exist for events

Events at different places never happen at exactly the same time.

Simultaneity exists.

Cars move down the road, simultaneously. You and I are alive, simultaneously. My shoe is black and on my foot, simultaneously.

It is unclear to me how you're defining 'event' and 'time' such that only one event in the entire universe can happen at a time.

But, in terms of concurrent computation models, simultaneity can easily be formalized. Temporal semantics make it obvious, but even without temporal semantics we can identify temporal equivalence classes for events and temporal overlap for activity and thus identify what is 'simultaneous'.

Activities can overlap in time

Yes, activities can overlap in time. Sometimes this depends on frame of reference. A paper on how to represent this has been published here:

Norms and Commitment for iOrgs(TM) Information Systems: Direct Logic(TM) and Participatory Grounding Checking

An arbiter can be in no well-defined state arbitrarily long

The problem is that an arbiter can be in no well-defined state for an arbitrarily long period of time.

Then throw away the model

[ Snipped bollock comment. ]

Yah, I see it now. The 'not well-defined state' comment set me on a wrong track. Naming it such is pretty counter-intuitive, since I assume there is a (global) state description for any ActorScript system, even if it would be informal.

Unbounded Non-determinism

Any observable thing including a concurrent computing "system", cloud or otherwise, is determinate to the extent that it can be described as a system. Systems normally require the introduction of suitable constraints, this is surely also the case with software artifacts.

The early history of "software" was concerned with elimination of the "hardware" constraints used in early computers. The development of the Manchester Computer which introduced the first random access memory was a breakthrough in which Turing himself participated. As useful as this is, software artifacts have constraints in the same way as physical things. Working software needs constraint. These are typically introduced late at night when no one is watching.

actors not more powerful than TMs

Consider a non-deterministic Turing machine that either halts leaving behind an arbitrary finite string of "1" characters ... or else never halts.

Now, consider an Actor which, given the solitary input of a "start" message will, as Hewitt describes, start counting at: zero ("") -- send itself concurrent GO and STOP messages -- keep incrementing the count for each GO received ("1", "11", ...) -- finally, send a reply to START indicating the number of GO received when STOP finally arrived.

That Actor computes any finite string of "1" characters .... or else hasn't yet halted. Exactly like the TM.

The two are exactly the same thing. Theorems about one translate trivially into the other.

Hewitt says it's not a fair comparison because the Actor is guaranteed to halt while the "1*" Turing machine is not. Yet, for any finite time K, neither is guaranteed to have halted. Actors might make a slightly nicer notation for some proofs (and TMs for others), but they are not very different at all.

So it did boil down to fairness?

I.e., in the Turing machine an enabled halting transition should eventually be taken, and in the Actor model, a posted message should eventually be processed?

Repeated nondeterministic choices versus eventual arrival

It's more a contrast between:
* Repeated non-deterministic choices in Turing Machines
* Eventual message arrival in the Actor Model similar to the assumption that a Turing Machine will eventually make the next state change

a better foundation is needed

It's more a contrast between:
* Repeated non-deterministic choices in Turing Machines
* Eventual message arrival in the Actor Model similar to the assumption that a Turing Machine will eventually make the next state change

Where is the contrast?

I mean, you are saying that certain proofs rely on so-called unbounded-nondeterminism as found in Actors but not TMs. Can you produce some rules of inference that apply in the Actors case but have no easy analog for TMs? I don't see any.

Actor Model Proof Theory

There is an extensive published literature on proof theory for the Actor Model. See the following paper and its references:
Actor Model of Computation

re: Actor Model Proof Theory

Thanks for the pointer to your earlier paper. To my ear, the analogies to physical computation of the Actor model are the stronger rationale and the bit about proof power is just not quite right. ("Proof convenience" might be convincing.)

I like that you predict / call-for tagged architectures. (I'm skeptical about extensions to x86 being where we go. The reason is that the biggest investors likely to drive tagged architectures care (I think) less than you might think about x86 compatibility -- so why bother with the complexity. They'll likely care a lot more about power consumption -- a reason to positively avoid x86.)

analogy considered harmful

Re: comment #63286:

To my ear, the analogies to physical computation of the Actor model are the stronger rationale…

Note to self apropos nothing in particular: Analogy Considered Harmful by Frank Halasz and Thomas P. Moran:

Analogy, used as literary metaphor, is effective for communicating complex concepts to novices. But analogy is dangerous when used for detailed reasoning about computer systems — this is much better done with abstract conceptual models.

A delightful paper in that it is only 3½ pages long, which — for my money — is almost as good as a two-page paper.

Just sharing

My apologies in advance if this appears to some LtU members as some kind of flame-bait; I do feel I have to insist upfront that this is truly not my intent in any way in what follows below.

Also, I should reiterate that I clearly do not carry as many academic credentials as a number of you who debate regularly about such important topics (that is, "important" if only w.r.t. computing science research results in general, and PLT's more specifically, and their direct or indirect applications to our so-called "real-world" problems).

(I.e., on these rather abstract topics, unlike on others, I'm mostly, say, just a hobbyist.)

Now, this disclaimer being given...

I do not know whether Mr Hewitt's Actor Model actually needs, or not, better foundation(s) in regard to its claimed theoretical power vs. Alan Turing's machine.

But...

Over the last two years, I have spent a number of hours (in a regrettably very sparse fashion, mainly because of the all-too-rare spare time I can allocate to it) reflecting on my own intuitive ideas related to these topics, and I am still in an ongoing process of giving them a formalization worthy of the name.

Of course, a lot of that time was also spent "in parallel" (no pun intended) (re-)reading numerous papers, articles, and forum threads from the LtU community and its like.

I've now reached a point where I am confident that I have found at least one thing that might be of interest to others, and I'm thus sharing it here:

This is about a strong doubt — a doubt that grows stronger with every new minute and hour that I spend reflecting and/or reading further on these topics:

yes, this is my very strong doubt that, in the end, and given the usual definitions of soundness, completeness, inconsistency, etc. (and the ways they are considered useful in relation to the various theoretical models of computation), trying to oppose one model of computation to others, on the basic implicit (unconscious?) assumption that the formal properties of what is devisable, then definable, computable, provable, reducible, normalizable, etc., in finite space AND time (my emphasis on "time", w.r.t. a "known termination" in a more or less broad sense) are what matter, WILL actually produce more useful computational models, along with possible implementations of them.

My belief (as I can't prove it, obviously) is indeed that this kind of argumentative opposition, in which one would expect to be able to conclude that one model is strictly "larger" (or more powerful, etc.) than the other w.r.t. finite-step computation, won't provide us with anything more useful than what we already have (after Gödel's, Turing's, Church's, etc. results), because this finite-time-computation applicability criterion isn't, IMHO, the most decisive one for resolving our present and future software-at-large issues.

[Even though it may be so (and certainly is, most often) when it is clear upfront that we do not speak of software-at-large at all, and instead have "only local" or "very local" algorithmic considerations.]

Let me elaborate, or rather rephrase this, actually:

I think that finite-time computation is clearly very often relevant and most useful (when not absolutely necessary) locally, but I now seriously doubt we will benefit, as much as we could hope, from trying so hard to absolutely ensure it also at a much larger scale, such as on top of a global infrastructure like the WWW (or significant subsets thereof).

Hence, what if the power (as it is known, taught, and profited from in today's technology) of the UTM and its infinite set of deterministic (or non-deterministic) offspring accessible to our human imagination, in this seemingly never-ending constructivist story of ours which consciously started with Gödel, Church, et al. a little less than 80 years ago, were ALREADY largely enough, actually, for our practical purposes, PRECISELY whether or not all of soundness, completeness, and consistency can be achieved at the same time?

Yes: what if a non-terminating UTM, in the process of computing "something" we will never fully know (because, though it would be built out of a finite number of states, we would also feed it an infinite, time-dependent input, where "infinite" could be read as "arbitrarily long"), would STILL be useful?

Huh?! What???

Yes: the computability of something, as we generally accept it (well, unless I missed something), does seem to imply the termination of an instance of use of the rules of some (finitely axiomatized) formal system — and the system is complete when all tautologies are theorems, while it is sound when "only" all theorems are tautologies (which is generally "easier" to achieve than completeness, precisely because of our ambient constructivism, conscious or not).

As Gödel has shown us, though, if one is really concerned with achieving both consistency and computability in finite steps, we'll have to drop our wish for completeness at some point, which, as I see it:

practically means that some (hoped-for, potential, yet unknown, etc.) results that we would like, at some point T in our time of knowledge and technology, to be computable in a T+dt future, will never be obtained, without changing our formal systems.

But then, it is, I think, useful to recall the nature of those results we're always interested in finding: the aforementioned theorems.

And this is where I came to the realization that, when we speak about computation steps in a program, or about the lemmas and final theorems in formal proofs, the "practicality" (or "usefulness") criteria we choose to apply before calling something a theorem (an important result) or "just a lemma" (an intermediate step, somewhat "less important" than the "theorems" we target) are pretty much a matter of point of view, i.e. of the ontology we are in.

It is easily observable, I think, that some branches of mathematics accessible to our brains regularly re-use, at a very practical level for their purpose, in context, sometimes very close to their final theorem (i.e., in the number of steps before the "QED"), what are "just" lemmas in another branch, even though those appear earlier (higher up) in that other branch's own proofs of its concluding theorems.

Thus, I regard the existence of a non-terminating UTM, or "just" a (non-universal) more specific NDTM, as an indeed useful device when its indefinite execution provides the computational equivalent of these new "lemmas": still interesting for a practical purpose of the computation, in a specific given intent of the formal system, though not seen as being "as powerful/general" (in mathematics parlance) as theorems are, in regard to other contexts.

Do we already have a concrete, tangible instance of such a UTM (or a multi-dimensional product thereof) running somewhere? Maybe we do, and the WWW would be a good candidate witness for this assumption (or "conjecture", rather).

Then, I see it as more useful to investigate the humble acknowledgement (along with recognition of the full extent of its consequences) that the non-termination of this UTM is actually not a problem, if we refocus our effort on the scalability of formal-language design for the future formal systems, and their respective languages, that will come to extend it (that UTM) by being wired into software and hardware.

Also, in such a would-be scheme, where unknown termination is no longer seen as a fundamentally negative property for the ontology of the computability theory you have at hand and try to reify in hardware and software, you'll likely realize, as I did, I believe, that in your effort to circumvent/work around the incompleteness of your formal systems interpreted by that UTM (which is maybe the only "global and unique" one in the strict sense, btw) you may need to introduce along your way such notions as intent and purpose. This would be for at least filtering out (context-dependently, of course), in a deterministic way, the "uninteresting lemmas" w.r.t. the given intent and purposes (again, in the same analogy as above) vs. the interesting ones, out of this indefinite execution's by-products.

I believe it is clear enough that the will to implement such a scheme would then de facto imply accepting recurring "ontological shifts" in regard to what we call "computation" today:

thus, it wouldn't be so much about constraining the design of the layers of formal-system constructs we'd devise w.r.t. a (somewhat, so far) "sacred" prediction of termination, as it would be, rather, about constraining these layers w.r.t. different kinds of context-dependent "captures" (at discrete points, marked in time and space) of intermediate formulas that the indefinitely running machine is instructed to keep as "interesting enough" — while still using some helper call-outs/call-ins to/from other, more "familiar", yes, computation processes that are meaningful only when done in finite time (type checking, the phrase-language membership problem, SAT-like solving, rewritings, normalizations, etc.).

Finally, in regard to my future efforts at a hopefully feasible "bootstrapping" of practical implementations of that sort of programme, I have so far found only J.-Y. Girard's Linear Logic(*) to be (likely?) the closest semantic framework able to support these ideas.

Or, Okay, maybe I just don't make sense to anybody but myself(?)

(*) I was glad to notice P. Wadler's recurring interest in LL (for other, more classically type-theoretic, and more seriously formalized purposes than mine, of course), too, btw.

[Edit]
.a few typos/grammar fixed
.final mention of my interest in Girard's LL (which I only noticed in late '08)

re: "fairness"

I.e., in the Turing machine an enabled halting transition should once be taken, and in the Actor model, a posted message should once be processed?

Fairness isn't important. We're comparing all permitted finite traces of the counting actor and all finite traces of the "1*" NDTM. We find out that they compute identical sets. We're done. Fairness is not an issue. (That is basically what Shutt is saying with the suggestion of comparing always-halting Actors to NDTMs that, in any achievable state, may halt in a finite number of steps.)

How do you prove that a server guarantees service, then? You don't. You prove that in any finite run service is guaranteed. This corresponds to the real-world observable that for any practical amount of time T, both the Actor and the NDTM counting machines can run for longer than time T. The assertion that the actor is guaranteed to halt is a non-constructive exercise of a choice axiom, picking a T without being able to ascribe any properties to it other than "finite". In short, if you don't non-constructively pick T, you wind up proving (quite sensibly) that either the server provides service or it doesn't terminate. (Of course, it might do both in cases more complicated than the counting Actor.)
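The "in any finite run, service is guaranteed" reading can be made concrete for the counting Actor. This is my own illustrative Python (the function name and the encoding of a run by the step at which "stop" is drawn are assumptions, not from the thread):

```python
def counting_actor_traces(max_steps):
    """Enumerate every permitted finite run of the counting Actor up to
    max_steps message deliveries.  The mailbox holds one "stop" and one
    "go"; drawing "go" re-enqueues a "go", drawing "stop" halts with the
    current count.  A run is therefore fixed by the step (if any) at
    which "stop" is drawn."""
    runs = []
    for stop_at in range(1, max_steps + 1):
        # "stop" drawn at step stop_at: halted, having counted stop_at - 1 GOs
        runs.append(("halted", stop_at - 1))
    # "go" drawn at every step: not yet halted after max_steps deliveries
    runs.append(("running", max_steps))
    return runs
```

Every non-halted run of length k extends to a halting run of length k+1, so service is guaranteed in any finite analysis; yet for no finite K is the Actor guaranteed to have halted within K steps, which is the point being argued above.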

Finite traces have limitations analogous to finite state machines

Finite traces have limitations analogous to finite state machines. See the work by Roscoe at Oxford. It can become incredibly complicated.

And, yes, we do want to be able to prove that servers provide service to their clients (full stop). Otherwise, we are doing a dis-service to the software engineers. Of course, performance issues must also be addressed in due course.

re finite traces / foundational question

Can you suggest a particular piece by Roscoe to illustrate what you mean? I think we must be misunderstanding one another because I don't think there is anything remotely complicated about what I'm saying.

I understand you to be making a strong claim about the foundations of mathematics: that you can prove certain service guarantees with Actors but not with NDTMs. I have trouble believing that, here is why:

Corresponding theorems:

"The Counting Actor always replies." is like proving (as Shutt pointed out) "The 1* NDTM may yet halt even after any X steps for non-negative X."

"The Counting Actor can count up to any non-negative, finite N" is like proving "For every such N, there is exactly one halting trace of the 1* NDTM, and it counts up to N."

Interpreted constructively, those pairs of theorems say exactly the same thing.

Interpreted non-constructively, I suppose that the Actors version says more. It says that for every Counting Actor sent a start message, there exists a finite time interval T after which a reply has been sent. Unfortunately, no construction of T is provided, thus giving me no time for which to set my wrist watch alarm. Axiomatic belief in T in a real-world computation is a matter of faith. Most importantly, there does not exist any finite time interval S such that T is guaranteed to be smaller than S. This is not really much of a service guarantee.

I do like the Actor model as I understand it. In essence, you are modeling computing as finite state machines that can spawn finite state machines; each machine corresponding to a relativistic worldline; the world of observable events corresponding to message deliveries and receipts; uncertainty as regards delivery order but uncertainty constrained by a need to preserve causality.

There are choices in such a model: does uncertainty permit unsent messages to be received? does it permit sent messages to not be received?

As a practical matter (e.g., modeling computation within a chip vs. computation in an Internet Cloud) multiple permutations of those choices make sense:

unsent never received == assume secure communications
unsent maybe received == assume (deliberate or accidental) spoofing

sent maybe not received == unreliable communications
sent always received == reliable comms

The mathematical to physical mapping of choices in Actor semantics to real-world computation is appealing!

But for each of those choices of semantics, I can tell you how to read off all of your same theorems from theorems about NDTM. Universal computation remains universal. Actors aren't new and more powerful in that sense. All effective computations can still be performed by TMs. All theorems about Actors still have easy translation into (ND)TM theorems.

My horse just lost its shoe

Yeah, well, there are many definitions of fairness. Normally, strong fairness is defined such that one restricts oneself to proving properties over infinite runs of the system where each enabled transition is taken infinitely often, or proving properties over finite runs of it.

In the two-line TM example, you just restricted yourself to the one fair run of the TM according to the standard strong fairness definition from automata theory. Thus I could claim you are just using the standard fairness assumption.

Now, you can, and you did, rephrase that as: we just restrict ourselves to finite runs, i.e. the halting state must be reached. I would just consider that another notion of fairness. Both notions coincide trivially for this small example.

But one's mileage may vary on how to interpret the examples. I now read Hewitt's post as a historical comment on the fact that the standard TM model is incapable of modelling concurrency, which is, well, sound but pretty well known.

Dodging the fairness bullet

One of the motivations for the development of the Actor model was to dodge the fairness bullet by not trying to base the foundations of computation on nondeterministic choices.

Yes, this stuff is fairly old now: The Actor model was first published in 1973. You can find a history here:
Actor Model of Computation

Only have to send one "stop" message.

When it gets the "start" message, the Actor system sends itself one "stop" message and one "go" message, both of which are guaranteed by the model to arrive at some time in the future.

After that
* when it receives a "go" message, then it increments the count and sends itself a "go" message
* when it receives a "stop" message, it sends the current count to the customer of the "start" message
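The behavior just described can be sketched as a small simulation. This is my own illustrative Python, not Hewitt's code; the random draw from the mailbox is an assumed way to model the nondeterministic arrival order:

```python
import random

def counting_actor(rng):
    """Simulate the counting Actor after "start" arrives: the mailbox
    holds the "go" and "stop" messages the Actor sent itself, and the
    next delivery is drawn at random from the mailbox."""
    mailbox = ["go", "stop"]
    count = 0
    while True:
        msg = mailbox.pop(rng.randrange(len(mailbox)))
        if msg == "go":
            count += 1               # increment the count ...
            mailbox.append("go")     # ... and keep counting
        else:                        # "stop": reply with the count
            return count
```

Note that this random simulation only makes the never-halting run have probability zero; the Actor model's guaranteed delivery excludes it outright, which is exactly the distinction under debate in this thread.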

# of "stop" msgs doesn't matter

Only have to send one "stop" message.

Sure.

Analogously, a NDTM "1*" machine only needs one non-deterministic branch:

state 0:  emit 1, move right, goto EITHER 0 or 1
state 1:  halt

There is no point in distinguishing "The Actor's STOP will be delivered in finite time" from "The NDTM may halt at any finite step". Theorems (and proofs) about each of these apparatuses trivially translate into theorems (and proofs) about the other.
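For comparison, here is a minimal sketch of the single-branch "1*" machine above, again my own illustration: the step cap is needed because the run that always chooses state 0 never halts.

```python
import random

def ndtm_one_star(rng, max_steps=10_000):
    """Simulate the two-state "1*" NDTM: state 0 emits a 1, moves
    right, and nondeterministically goes to state 0 or state 1;
    state 1 halts.  Returns the tape contents on halting, or None if
    the run has not yet halted after max_steps."""
    tape = []
    for _ in range(max_steps):
        tape.append("1")                 # state 0: emit 1, move right
        if rng.choice([0, 1]) == 1:      # the nondeterministic branch
            return "".join(tape)         # state 1: halt
    return None                          # the all-goto-0 prefix: not yet halted
```

Every halting run yields some finite string of 1s, and after any finite number of steps a non-halted run may still halt later, mirroring the counting Actor.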

Of course, you understand that your NDTM may not halt

Of course, you understand that your Nondeterministic Turing Machine (NDTM) may not halt, i.e. it may always do "goto 0". This is because of the way that NDTMs were defined.

as might the Actor system

This is the same ability that the Actor system has. After any given finite period of time, it's possible that the Actor system has not yet halted. After any given finite period of time (measured in number of steps, for a TM), it's possible that the NTM has not yet halted.

The Actor system always halts but the NDTM doesn't

The Actor system always halts but the Nondeterministic Turing machine does not always halt.

Unsimple misunderstanding

I was sure we were somehow misunderstanding each other. I think I've found the problem.

If you walk into a room full of old-school mathematicians and start talking about "natural numbers" when you mean "non-negative integers", you're going to have problems, because your words don't mean to your audience what you thought they would mean. (For your audience, zero is not a natural number.) If you step into the world of TM-based computability theory, and say, with reference to the systems/machines we're discussing,

The Actor system always halts but the NDTM doesn't

you're going to have problems, because your words don't mean to your audience what you thought they would mean. It appears you've been working in a community of mathematicians who routinely bandy about infinite strings (the "fairness" crowd clearly have no problem with them) — but by making computability claims about NTMs, you've stepped into a realm where (as I've seen NTM computability theory practiced, and apparently I'm not alone) there is no such thing as an infinite string. (Shades of Leopold Kronecker.) Even a TM's "infinite tape" isn't literally infinite, it's just allowed to grow without bound.

In that conceptual framework, the distinction you're trying to make has no objective meaning: there is no such thing as an "infinite run" of the NTM (nor of the Actor system), and in every finite run, either the machine/system has halted, or it is still possible that it will halt later. Only by introducing the concept of an infinite string can you distinguish between "always halts" and "as time increases without bound, the probability of halting approaches one". It isn't true that the standard treatment of the NTM allows the non-halting infinite run; on the contrary, it doesn't even admit that run's existence.

[edit: spelling]

Infinite strings not needed; just infinite set of finite strings

Infinite strings not needed; just an infinite set of finite strings. See the following:

Computational Representation Theorem

This is not about the

This is not about the superficial question of how to represent infinite strings. Computability theory omits the concept of infinite strings, because it has no use for that concept, and therefore doesn't care how one would go about representing them.

The Wikipedia article section you refer to (your URL has a typo, BTW) confirms that you are doing what I said you were doing: it pointedly requires that infinite strings must be represented. There may be some reason, outside the realm of computability theory, why it wants to assume that (my guess, from that section, is that the reason has to do with denotational semantics); but for purposes of computability theory that assumption makes no sense. Computability is concerned only with whether the machine halts in finite time, and consequently the infinite "meanings" that the article section talks about "approximating" are superfluous.

After any finite time, the Actor system is in a configuration that may or may not be halting, and the NTM is in a configuration that may or may not be halting. There is a bijection between (equivalence classes of) configurations of one and (equivalence classes of) configurations of the other, such that the bijection preserves the "possibly successor" relation between configurations, and corresponding configurations either both have halted, or both have not yet halted.
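For the counting Actor and the 1* NTM specifically, that correspondence can be sketched directly. The configuration encodings below are my own illustration, not from the thread:

```python
def actor_to_ntm(config):
    """Map a configuration of the counting Actor to one of the "1*"
    NTM.  An Actor configuration is (count, halted); the matching NTM
    configuration is (tape, state), with `count` ones on the tape and
    state 1 exactly when halted."""
    count, halted = config
    return ("1" * count, 1 if halted else 0)

def ntm_to_actor(config):
    """Inverse direction.  The pair of maps preserves the "possible
    successor" relation (count n steps to n+1 or halts; a tape of n
    ones steps to n+1 ones or halts) and preserves halting."""
    tape, state = config
    return (len(tape), state == 1)
```

The maps are mutually inverse, so corresponding configurations either both have halted or both have not, as claimed above.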

Note that this NTM has unbounded nondeterminism. And its computational capabilities are entirely treated, without ever introducing the concept of an infinite sequence of configurations. If one were to add those infinite sequences, computationally the only thing that would be added would be the possibility of letting the system/machine run for an infinite amount of time, and there is a word for that: impossible. It is the ultimate in impracticality, something that is absolutely certain never to occur.

If, however, you really want to retrofit the treatment of NTMs with this spurious notion of infinite time, it's entirely arbitrary whether you say that the "infinite non-halting" sequences are allowed or not. So the claim that the Actor system always halts while the NTM doesn't always halt is true only because the claimant decided to make it so.

[Note: Looking back over that, bijection of equivalence classes of configurations may actually be an oversimplification — but not in any way that actually affects my central point.]

A Nondeterministic Turing Machine has bounded nondeterminism

Plotkin proved that a Nondeterministic Turing Machine (NTM) has bounded nondeterminism. See What is computation?
Actor Model versus Turing's Model

Mike Davey's TM

Re: comment-63283:

Even a TM's “infinite tape” isn't literally infinite…

That much is clear from the photo.

Great Product

Every CS department should have one.

re: The Actor system always halts but the NDTM doesn't

Then theorems about runs of Actors are the same as theorems about a trivially identified subset of runs of the NDTMs. You didn't get extra proof power from the Actors formulation.

Strictly speaking, Actor computations are partial orders

Strictly speaking Actor computations are partial orders; not just linear sequences of states as in a Nondeterministic Turing Machine (NDTM). See explanation above.

Actor partial orders

Do you mean Event Order (as induced by activation and receipt order for messages)?

A good NDTM translation might be this:

In a run of an NDTM, for each cell location, keep a log of the sequence of values written to that location, and the number of times the location is read between each pair of writes. If two runs of the same NDTM converge on the same machine state, tape location, and tape contents -- AND they have the same lists of writes and read counts for each location -- then let's call the two runs "Strongly Convergent".

The relative order of reads and writes to *distinct* locations can vary among Strongly Convergent runs - but not arbitrarily.

If in *every* run Strongly Convergent to some run R, the order of some writes and reads relative to one another is preserved -- then the earlier writes and reads are said to causally precede the later ones.

If neither of two reads or writes causally precedes the other, they are said to be Simultaneous.

Computational Representation Theorem

re: computational representation theorem

Yes, I did read that. It is, in effect, what I'm responding to there.

You wrote there that: "The upshot is that concurrent systems can be represented and characterized by logical deduction but cannot be implemented."

Well, no. That's just not so. That's not the best way (in my opinion) to look at why Actor semantics are interesting. (Yes, I appreciate the potential irony in *me* (a "who?!?") telling *you* (as in "you know that that's actually *the* Hewitt") this. I mean this with the greatest respect. Sincerely. Really. The Internet makes it hard to convey that. I think this is an interesting topic on which, oddly and uncharacteristically, you have a not quite right idea in one of your areas of famous expertise.)

If you give my opponent idealized hardware for building closed Actor systems, and me idealized hardware for building NDTMs -- I can build anything he can build. If you want to prove anything about one of his systems using the Actor formalism, I can prove the same thing with a reasonable NDTM formalism. The example proof strategy I gave above is good enough to see this: taking alternative possible tape content states and treating certain properties of their partial order in the manner that you and Clinger treat Event Order in Actor semantics. The main difference between Actors and Turing Machines is that in Actor semantics we default to non-deterministic machine specifications and we have a more interesting topology for the infinite state. We get to the same place in Turing machine formalisms by moving from infinite tapes to infinite graphs and by moving from deterministic TMs to meta-deterministic NDTMs ("meta-deterministic" as in "always go from tape state S to S' eventually but not always by the same intermediate tape states"). Basically, Actors have a more intuitive sense of physical space -- so it's a nice formalism, but it ain't a more powerful formalism.

Actors are computationally more powerful than NDTMs

Actors are computationally more powerful than NDTMs. For example see the program with unbounded nondeterminism in What is computation? Actor Model versus Turing's Model

Constructive proof theory?

After looking over the "Triumph of Types" paper on the front page does the actor model have a constructive proof theory?

ActorScript has types that are Actors

ActorScript(TM) has types that are Actors. So reasoning is the same as for all Actors.