Session Types for Purely Functional Process Networks

Session types greatly augment purely functional programming. Session types enable pure functions to directly model process networks and effects.

We can adapt session types to pure functions by first reorganizing function calls of form `(A, B, C) -> (X, Y)` to a form that uses distinct, labeled input and output parameters, like `fn(in A, in B, in C, out X, out Y)` which we can easily rewrite to a sequential session type `?A ?B ?C !X !Y`. I'm assuming the parameters are easily distinguished, either by distinct data type or by augmenting with named parameters (such as `a:int` vs `b:int`).

Sequential session types conveniently represent that intermediate outputs are available before all inputs are provided. A simple reordering to `?A !X ?B ?C !Y` would correspond to a conventional function type `A -> (X, (B, C) -> Y)`. However, in contrast with the conventional type, the session type is recognizably a subtype of `?A ?B ?C !X !Y` or even of `!X ?A !Y ?B !D`. Thus, we can eliminate most adapter code to relax ordering in the caller. Further, a functional language designed for session types can easily support a direct programming style, e.g. based on imperative reads and writes to single-assignment parameters, thus avoiding the noise of continuation-passing style.

Sequential sessions already demonstrate a trivial model of interaction: the caller can observe the intermediate output `X` before computing inputs `B` and `C`. We can also model 'plain old structured data' types as unidirectional sessions, e.g. a type `!x:int !y:int !z:int` is essentially a record value with labels x, y, z.

Session type systems usually also support choice and recursion.

We can adapt 'choice' to pure functions by assigning a choice-label to a choice parameter and this label determines which subset of choice-specific parameters we'll use. A simple example:

type IF = &{ add: ?x:int ?y:int !r:int
           | negate: ?x:int !r:int
           }

With this definition in scope, the session type `?method:IF` could represent an external choice of 'method'. The choice-label `add` or `negate` might be assigned to implicit parameter `method.case`, and the label chosen will determine whether we further use parameters `in method.add.x : int`, `in method.add.y : int`, and `out method.add.r : int` or `in method.negate.x : int` and `out method.negate.r : int`. This is an exclusive choice, so a compiler could safely 'overlap' memory for these five parameters, similar to a C union. But unlike a conventional union or variant, the choice determines both inputs and outputs. Choice session types can conveniently model object-oriented interfaces or singular request-response interactions.

Aside: Session type systems distinguish external choice (&) vs internal choice (⊕). In the adaptation to functional programming, whether a choice is external or internal is based on whether the 'choice parameter' like 'method' in is input or output. However, it's convenient to represent some choices from the 'external choice' perspective. Thus, use of `&` above allows the type to be syntactic sugar for `{ add: !x:int !y:int ?r:int | negate: !x:int ?r:int }`. When we later provide this type as input, via `?method:IF`, the label is input and all the `!` and `?` types are flipped.

Recursive session types can further augment our functions with unbounded trees or streams of interactions. Conceptually, they allow functions to have an unbounded set of parameters, each with a unique 'path' name. A demand-driven stream type might have a form: `type Stream x = &{quit | more: !hd:x ?tl:Stream x}`. Whether a demand-driven stream has more elements is chosen by the receiver, not the sender. (Session types can also model normal streams, push-back streams, and others.) In context of recursion, our pure function logically has parameters of form `more.tl.more.tl.more.hd`, where `(more.tl)*` may recur in the parameter name an arbitrary number of times.

Use of recursive session types is similar to conventional functional programming with tree-structured data. A compiler or garbage collector can recycle memory for parameters that become irrelevant to further computation. Session types can represent many useful evaluation strategies such as call-by-need or bounded-buffer pushback. Intriguingly, session types can model 'algebraic effects' via recursive streams of request-response choice sessions.

Beyond sequencing, choice, and recursion, we can also extend functional programming with 'concurrent' sessions to represent partitioned data dependencies. For example, with function type `(A,B,C) -> (X,Y,Z)` it's possible to have a data dependency graph of form `(A,B) -> X; (A,C) -> Y; (B,C) -> Z`. It can be convenient to represent this precise data dependency graph in our session type. Fortunately, it's a simple extension to add concurrent types (though concise description, avoiding redundant expression of dependencies like `A`, is non-trivial).

Session types give us a rich model for interaction with pure function calls.

Implicitly, these interactions are between the 'call' and 'caller'. Fortunately, it is not difficult for a session-typed functional programming language to support 'delegation' such that we tunnel handling of interactions to another function call. When we begin to delegate long-lived sessions (e.g. recursive streams) between functions, the program begins to take a form of a 'process network' where pure functions are the processes and delegation models the wiring between them. Use of session types and delegation for purely functional process networks will subsume Kahn Process Networks (KPNs), which are limited to simple streams as the only interaction between processes. With session types, we can effectively model processes that rendezvous, coroutines, processes that have clear bounds on input and output, clear termination behavior.

As a summary, session types for purely functional programming supports:

  • a more convenient alternative to continuation passing style
  • function types able to directly represent object-oriented interfaces
  • a surgically precise alternative to 'call-by-need' vs 'call-by-value'
  • streaming request-response interactions for rendezvous or effects
  • process networks that tunnel interactive sessions between function calls
  • process models and interactions more flexible than Kahn Process Networks
  • opportunity to fuse loops and optimize dataflow within the network
  • type safety, subtyping, and progress guarantees for all of the above

Session types greatly improve this does not compromise functional abstraction or functional purity, except insofar as unbounded interactions with functions are not what we usually imagine from the mathematical connotations of 'function'.

I have not searched hard for prior art on the subject of session types exposing partial evaluation of pure functions as a basis for interaction and deterministic concurrency. I would not be surprised to discover all this is known in obscure corners of academia. But to me, who has recently 'discovered' this combination, this seems like one of those 'obvious in hindsight' features with an enormous return on investment, which all new functional programming languages should be seriously pursuing.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Syndicate

I'd be interested to hear how this compares to Syndicate, cf. From Events To Reactions: A Progress Report (slides). There are some interesting examples in there about managing conversational state.

re Syndicate

Syndicate seems to be a language designed around a blackboard system pattern. Processes communicate via shared data bus or tuple space.

This is not a bad pattern, but it does have a flaw that is significant to some use cases: the shared data bus easily becomes a bottleneck for routing and communication. This is especially the case for deterministic models, where we cannot logically observe arrival-order non-determinism, so a fast process will often wait on the slowest within each step.

Kahn process networks (KPNs) are a deterministic concurrency model with point-to-point communication. Some processes in the network may be waiting, and we could certainly model bottleneck patterns like a data bus as a process within a KPN. But there are no intrinsic bottlenecks, so a KPN computation can easily be designed for large scale distributed systems. But KPNs have their own flaws: there is no clear termination behavior, it's awkward to integrate new processes at runtime, it's difficult to model request-response interactions where the response type should be tightly associated with the specific request.

What I described in the OP is a way to directly treat pure function calls as processes, by carefully exposing partial evaluation. The 'careful' here is via session types. Point-to-point communication is supported by tunneling interactions between function calls. The resulting system is closer in nature to KPNs than to Syndicate. However, it also avoids the flaws of KPNs because functions have well-understood termination behavior and are easily shared as first-class values.

From another perspective, what I've described is an augmentation to purely functional programming that exposes a more scalable and expressive basis for parallelism than the simplistic fork-join parallelism we can get from 'par' and 'seq' annotations, and a more expressive basis for pipelining than transducers. Partial evaluation is a basis for concurrency, and importantly is a basis that does not require layering artificial patterns onto the language, just carefully erasing some artificial modularity barriers that we've never thought to question.

State management

How do purely functional networks handle stateful resources and processes (e.g. a REPL)?

Stateful Loops

State is intrinsic to loops that 'fold' over a history. There are many patterns of this nature in conventional functional programming. The difference with session types is that we can explicitly and precisely model an 'interactive' history, e.g. streams of request-response rendezvous sessions, or bounded-buffer data channels with pushback, instead of (for example) the mysteriously maintained lazy input streams of FRP.

Session types allow bi-directional tunneling, but don't change how state is essentially represented.

Because interactions are precisely modeled, it's also feasible to distribute stateful loops across physical processors and compile them so communications are routed between remote processors instead of through a central coordinator. This is painful to support in conventional FP languages, and essentially was the motive for Kahn Process Networks. So, from a performance perspective, we're also gaining significant opportunities in contrast to stateful patterns with conventional functional programming.

Session Types & ABCD


[A] [B] input   = [[A] B]
[A [B]] output  = [A] [B]
[A] [B] compose = [A B]
    [A] copy    = [A] [A]
    [A] drop    =
[A] [B] swap    = [B] [A]

Are `input` and `output` enough to express "sequential interaction"? One of the example types you gave, `?A !X ?B ?C !Y`, reminds me of putting values in to a block and taking them out with those two primitives, but maybe there's more to it than that.

I'm also not sure how the "choice" you've described here works, maybe it requires a new primitive?

re abcd

Reading your title before your comment, I was wondering if you were referring me to A Basis for Concurrency and Distribution, an excellent resource on session types. Instead, you're referring to a stack based rewriting language. ?

A stack language doesn't usually have an "output" primitive, because it requires deep knowledge of implementation details. Essentially, that "output" primitive is something that sequential session types would make typesafe and usable.

However, in my OP I mentioned 'distinct' types in the adaptation to pure functions, such that '!A !B' and '!B !A' should be equivalent. The sequential `order` among multiple output parameters (or multiple inputs) is not something we observe. Sequencing is only relevant where some elements are inputs and others are outputs. This commutativity is convenient, enabling functions to be used in more contexts without tons of adapter code. Your example implementation seems to be missing out.

For languages where types cannot easily be distinguished, we could use labels (modeling named or 'keyword' parameters) instead. In your case, you could feasibly support limited commutativity by adding labeled data primitives like ':a' such that `[A] :a [B] :b == [B] :b [A] :a` (for distinct labels a,b) and perhaps `:a .a == ` and similar rules.

However, you'll quickly encounter a challenge intrinsic to stack-based languages: it's difficult to syntactically predict or control the 'arity' of a computation. For example, consider the program `[[B] :b Computation :a]`. Is `[B] :b` consumed by the computation? Is it shadowed by another `[B'] :b` forming an implicit `:b` stack? Should it be immediately available as output? A programmer cannot easily determine dataflow based on syntax; instead, we need very deep knowledge of the computation - its arity, which labels it observes, etc.. This is not a desirable property.

Although I'm fond of stack-based languages, this isn't a problem I've been able to solve. It's a non-issue if we aren't working with intermediate outputs, but a stack programming syntax seems very awkward with fine-grained dataflows. I believe that any 'tacit' syntax that enables programmers to predict and control dataflow would be a detailed instance of a dataflow graph. And graphs can be awkward to represent in textual languages.

Regarding 'choice': choice is essentially a restricted form of dependent types. Full dependent types in a stack language are feasible: we enable the stack type to depend on a value at the top of the stack. To restrict this similarly to choice types, we might limit the stack values we may depend upon to specialized enumeration or label types (as opposed to arbitrary integers, strings, etc.).

However, it would be relatively awkward to work with choice types via explicit dependent types on a stack in the above described manner. Session types are relatively fluid. For example, if we start with `!X ?A !Y`, then supply input for `A`, our remaining type is `!X !Y`. We really need to track fine-grained 'type state', not just a coarse grained 'choice type'. For stack programming, we might want to distinguish a type-state stack from the data-stack, with the type-state stack only being observed or manipulated by special operators like 'case()' switches or variant decisions.