Process Network for Effects, Monad Alternative

Monads are an awkward effects model in context of concurrency. We get incidental complexity in the form of futures or forking threads with shared memory. The running program becomes entangled with the environment, which hinders persistence and mobility and debugging. So I sought alternatives in literature.

Kahn Process Networks (KPNs) seem like a very good alternative. From an external perspective, they share a lot of similarities to monads, except we get more than one input (continuation) and output (effect) port and thus can model concurrent operations without relying on effects or environment support. Internally, KPNs have a lot of similarities to free monads: we can compose KPNs to handle effects internally, translate them, etc.. Use of KPNs as first class values allows for dynamic structure and mobile processes.

The main feature missing from KPNs is the ability to work with asynchronous inputs. But it is not difficult to add time to the model, and thus support asynchronous messaging and merges in a style similar to functional-reactive or flow-based programming (and somewhere between the two in terms of expressiveness). I doubt this is a new idea.

I've written about these ideas in more detail on my blog:

Reactive KPNs with open ports or channels also make a better FRP than most, having a far more direct API for pushing inputs and pulling outputs deep within a network.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Effects as sessions, sessions as effects

This POPL 16 paper might be of interest.

Sessions and pi calculus

That was an interesting paper, though it isn't clear to me how to adapt session types to KPNs.

Pi calculus doesn't have the right properties to serve as an effects model for a purely functional language. In particular, pi calculus is non-deterministic (which means we must understand the evaluation function), and pi calculus introduces new first-class names with vau (which means we must concern ourselves with uniqueness of name representations, and the potential for names to enter or escape scope within context of monotonic processing, and sophisticated GC issues surrounding use of named state).

Coroutine modelling

Simpler coroutine modelling is worth study I believe. In Felix coroutines have arbitrary channel I/O, however there is no buffering so they represent synchronisation points. Buffering in general is an evil hack to work around lack of understanding of synchronisation :-)

One of the key things about coroutines is the observation that any use of more than one thread for a single CPU is a design fault. You should be using coroutines. Threads are wrong in principle and way too expensive in practice.

If you have to handle asynchronous events, two threads is enough. Anything more is a design fault. More specifically, one thread per device and one extra management thread is optimal, and anything more is a design fault. If you have 8 cores on your machine, and you see Firefox running 100 threads, you know the programmers had no idea what they're doing.

On a GPU the optimal number is a bit higher due to slow memory and fast context switching 3-1 oversaturation is optimal.

My point here is that it seems hard to understand concurrency when the strictly serial coroutine model is not understood. Coroutines replace concurrency with indeterminate ordering, but preserve a guarantee that the ordering is total (even though you don't know what it is).

The key point is that subroutines are inadequate: that includes functions and traditional procedures too. Coroutines exchange control by a different model that complements functional programming and eliminates callbacks.

The purely functional world is broken. But it cannot be fixed with "effects". It needs to be fixed by symmetry: incorporating its dual into an integrated model.

coroutines etc

KPN values could be viewed as somewhere between coroutines and generators, I suppose. Internal order of evaluation is nondeterministic, but this is not externally observable.

Regarding your comment on threads: performance isn't the only consideration. Threads are often distinguished by task or responsibility. Lightweight threads (oft under a kilobyte) make this easier, since the set of hardware threads can correspond to CPUs.

I generally disagree with your comment on functions. I've contemplated using KPNs or their reactive variants as foundation instead of functions, seeing as process networks can subsume functions while remaining deterministic. But doing so both complicates reasoning and sacrifices a lot of nice properties that are possible with first class network descriptions as plain old values, and evaluation as a clearly delimited KPN->KPN function.


My general claim is that functions are only half the story. I would never throw out functions. As you say, functional programming in some cases can make things simpler, easier to reason about etc.

I'm not asking to switch to cofunctions as a foundation, but to *merge* functions with coroutines into a single model. I readily admit I do not know the best way to do this.

I'm currently writing some code of non-trivial complexity which uses a mix of procedural, functional, and coroutine based code, to see how it pans out. Actually a parser library. I'm learning as I go. It's beginning to look very convincing. My design could be implemented entirely with functions, however the coroutine model has some very powerful advantages.

One of the hard things to grok is that there is no termination. Because we have cofunctions, which process coinductively, we're dealing with streams, and streams don't end. So your thinking has to change.

As an example I battled with the fact that parsing a string, one gets a set of parses. The set could be empty (parse fail), it could have one entry (unique parse), or it could have multiple entries (ambiguous). In a functional model this is trivial: you can use a list or some other data structure to hold the all the parses. The data structure is modelling a product type IN SPACE.

But you cannot do that with coroutines, because the product is TEMPORAL. You get the values over a channel in a temporal sequence and there is NO END. Values just stop coming. None come at all in case of a parse failure!

Now guess what? This is just perfect! It is much better for adjunctive composition that the functional model. If you have parser A and parser B, and you want to compose them to get parser A |-> B meaning first parse a prefix of a string with A, then send the output to B and parse a prefix of the suffix, then the coroutine model just walks all over the functional model. In the functional model, you start with a set of input positions, each one generates a set of output positions, and you iterate through the input set merging the output sets, then send them to B. In the coroutine model you just connect the output channel of A to the input channel of B. Done.

If a thread fails, it just doesn't generate any output. In the functional model failures HAVE to generate outputs. Typically one uses an option type.

The functional model is the wrong model, its trying to use an inductive data type to model a coinductive data type. Coinductive types do NOT live in space. They live in time.

I don't think one can easily claim the dual model is harder, without at least developing it to the level of sophistication functional programming enjoys, and even then, I'm still advocating a symmetric system.

The core problem with functions is that they're subroutines in which data and control flow are integrated: and this is also their primary strength. In fact a huge weakness of (conventional) procedures is that the "effects" have no type, only the input is typed. However subroutines just don't cut it as a general tool by themselves, because local state is lost on return unless manually saved, defeating the primary advantage of coupling data and control flow.

Coroutines change the coupling rules, so control can be transferred to the coroutine on both data input and output, instead of being lost on output as with a subroutine, and local state is preserved on both input and output. So actually a function is just a simplified kind of coroutine, which means coroutines in native form are harder to use when a function would do. When you leverage their structure more deeply they're obviously superior to functions, by the same reasoning.

The hard bit is designing a model that uses both, and lets you choose which kind of model to use, AND lets you interface the components AND lets you re-engineer the system.

I can do that. I am doing that. But there are two core problems for me at the moment: there's still a lot of boilerplate, and, most importantly, I don't have any real THEORY to guide development. I know general principles, but i have to reason the hard way about each construction, each time. [The basic model is just a subset of CSP, but CSP is very hard use as a reasoning tool. The sequential subset CSC should be about as hard as functions to reason about .. because basically it is just the dual theory]

Logic languages

This was the basic dream of logic languages, and it seems really nice at small scale in prolog. Write predicates that either generate a stream of answers or fail with silence. Composability is simple connecting streams together (where the sequence is multiplexed into some Vars).

But complexity is a bitch. Taking the product of generating sets and then filtering them afterwards to find solutions does not scale well. Debugging it is painful as well.

CSP / Occam without alternation?

KPNs seem to be CSP/Occam style programming without the alternation construct to select between channels. The other difference is that the channel capacity in the Occam model is locked to one, but this is unimportant as buffering processes can simulate larger capacities.

CSP originally started out without an alternation construct, but it was added later to model problems where inputs must be multiplexed in an unknown order. How are these problems modelled in KPNs?


KPNs as originally specified by Kahn don't support alternation.

The reactive/temporal variant I detailed in the second linked blog article can model alternation quite trivially between wait and read - wait on a set of ports then read each and repeat, emitting messages in the order read. (Indeed, the expressiveness issue of merging streams of asynchronous inputs is what motivated reactive KPNs.)

So what is the difference?

So the wait primitive that you've added then looks exactly like Occam on a system with a global clock. Is there some difference in the model, or are they the same?

KPN vs Occam

I see two big differences in the model and one in my proposed use of the model.

KPNs don't block on write, only on read. The reactive variant also blocks on the wait action if necessary to determine the next process time.

Any expression of alternation in reactive KPNs is inherently deterministic. In Occam, alternation is a source of nondeterminism.

Occam has a built-in notion of completion, which is necessary for sequencing as a general composition operatior. A KPN subnet has at best an ad-hoc notion of completion, and isn't generally clear when we can replace one KPN subnet by another. My intuition is that it is feasible to extend KPNs with completion and sequencing, but I haven't worked out the details. Sequencing would support a more dynamic, state-driven network structure.

I use KPNs as plain old values - a network description of general type KPN with a pure KPN->KPN evaluation function. This allows some interesting patterns, like passing KPNs as objects and messages, reflection and extension as effects, backtracking.

Message loss?

One of the nice properties of blocking writes in the context of fixed capacity channels is that messages cannot be lost. This simplifies concurrent code as a layer of ack / retransmit is not necessary in the code to make sure that data-flow and control-flow remain coupled. For small scale, low-level systems this simplifies development. In larger scale systems making the mechanism to guarantee transmission explicitly visible can be more useful.

What about boundaries to the system - alternation over messages on external interfaces to the system would inject nondeterminism whenever the devices are independent / have separate clocking domains.

First-class (executable) code as a value is an interesting departure. What does the type system for KPNs look like?

Fixed capacity, boundaries, types

Yes, a fixed capacity channel has some advantages. The ability to control memory requirements is among them. While plain old KPNs lack this property in general, some people use a variant with bounded capacity and push back. The cost is potential data lock where processes wait in a loop.

The reactive variant doesn't need bounded capacity channels because we can constrain computations in the temporal dimension (how many logical microseconds we compute) instead of the spatial (how many messages are enqueued). If a process produces unbounded messages in bounded logical time, that's an error to correct or (feasibly) prevent via type system or progress/termination analysis.

Bounded buffers are essentially a means to model a temporal constraint within a spatial structure. Explicit separation of spatial and temporal aspects of any concurrent system has potential to simplify this and other such things. That said, using bounded capacity channels anyway doesn't hurt in the common case.

For external interface, we can push messages into a KPN's open input channels, evaluate, then pull from open output channels. The reactive variant can additionally control advance of logical time on open channels. So we could model pushing a list of messages into a KPN all at once or adding some logical delay between messages. Alternation within the KPN is independent of whether messages originate externally.

An important feature of KPNs is they are monotonic, so this pushing and pulling can occur in parallel with evaluation. The client only needs to wait for an output message if no such messages are already pending. So this provides a useful model for incremental background computation.

Typing a KPN isn't something I've spent much time researching yet. Conventional FP type systems don't make it easy, but my intuition is that GADTs or heterogeneous collection type could be adapted, or KPNs could feasibly receive specialized support from the type system in a new language. A simple type system might just assign a message type to each channel. A "dependent" type system (or a session-type? I'm still contemplating Roly's earlier comment) would require we adjust channel types based on prior messages sent.

Determinism can be bad

I think it is mistake to assume determinism is a desirable property, that is, desirable because it makes reasoning easier. On the contrary, I would argue indeterminism makes reasoning easier. (Devil's Advocate for Good Science..}

Consider a routine that has two channels A and B, and the data comes on channel A first, then on channel B. The routine has to read A first, then B. Its puts the A data in variable a, and the B data in variable b. Then we do something with the data. Is the routine correct? You have to know the order the data is coming in. You have to be able to determine the reads are in the right order. When the data is used, you have to verify that the a data is used where you intended to use the first lot of data that came in. The reasoning is quite hard!

Now consider a routine that reads from either A or B, and we don't know which data comes on which channel, nor which data comes first. So you have to read A or B together to get some data, and do it twice. Since we can't distinguish the data by time or by which channel it came in on, we can't get the read order wrong, we can only forget to read exactly twice. And since we can't tell the difference between the two pieces of data in a calculation, we only have to make sure to use both where they're distinguished, we don't care which one is first in the usage expression.

Clearly, the nondeterministic case is *easier* to reason about. Knowing less is a vital part of all programming. Its called *abstraction*. The less you need to know the better. The less you know, the harder it is to mess up. Since nondeterminism reduces knowledge, the more of it you can get away with the better. Its a common mistake made by programmers that reasoning is easier if everything is determinate. One should always be asking: how can I get away with the least possible information?

The flip side of the coin: nondeterminism is *faster*. Why? Because it gives compilers more optimisation choices.

Let me give a horrific example. Python. In Python, the result of everything is determinate. It is almost impossible to optimise. In particular, there is no such thing as an error in Python. Not even a syntax error. Yes, that's right. You can import entirely the wrong file, and you are sure to get a SyntaxError exception. And you're allowed to catch it. And do something. So it isn't an error. So it is quite hard to meaningfully compile a Python program because you cannot even report a SyntaxError and terminate compilation, because there's no such thing. You actually HAVE to generate code that throws a SyntaxError at run time. Determinism gone mad.

You're not comparing

You're not comparing equivalent problems. This isn't contrasting verification of a non-deterministic vs. deterministic protocol, it's contrasting the verification of an ordered protocol vs. an unordered protocol.

A proper comparison is a program where you have to forward messages in a specific order, where they arrive on a single channel that is:

1. deterministic: message A always arrives first, followed by message B (deterministic).
2. indeterministic: messages A and B can arrive in any order.

Clearly (2) has more cases to analyze, and hence is harder to verify to ensure A and B are forwarded in the correct order. I think determinism always makes reasoning easier.

Re: determinism

Determinism isn't just good for reasoning. Determinism is also good for testing, replay, scientific reproducibility, caching, consistent sharing of computed views and procedurally generated data, omniscient debugging, and robust computations via redundant evaluation on multiple hardware devices (in cases where we can't trust the platform due to cosmic rays or nefarious agents).

Determinism doesn't imply strong ordering constraints. Irrelevant ordering information can be eliminated by sorted collections, especially those like a sorted list or trie where even balance is independent of access and update order. Evaluation of KPNs does not depend on the order in which messages on different channels, and so an accelerated evaluation function can leverage arrival order to improve resource utilization while still producing observably deterministic results.

While nondeterminism reduces knowledge, it does so in a manner that requires your code to be correct in exponentially more cases. Including cases you haven't tested but that will arise when you use your code in a new context where scheduling decisions have a different probability distribution. (Which in my experience guaranteed every time you change hardware or clients or optimizer/compiler versions or optimization flags.)

In theory, nondeterminism allows more optimizations. I have not experienced this in practice. What happens instead is a lot of extra, difficult-to-optimize barriers to constrain nondeterminism, and a lot of distrust in the codebase from programmers who keep getting new regressions when code is ported or moved, and blindly hack at code until they go away.

The Python example, I'd attribute more to side effects and lack of effective support for static partial evaluation.


Yet Python is a good language to program in, programmers are productive, and the code is readable. With things like "numpy" it can be all the above and also be fast.