On compositionality

Jules Hedges has written a thought-provoking blog post, On compositionality, in which he connects the familiar idea of compositionality to the idea of emergent effects in nature: systems can be understood as having either compositional properties or emergent properties.

The key point about emergent systems is that they are hard to understand, and this is as true for engineering as it is for science. He goes on to say, "As a final thought, I claim that compositionality is extremely delicate, and that it is so powerful that it is worth going to extreme lengths to achieve it"; avoiding emergent effects, then, is a characteristic of good programming language design.

Some thoughts:

  1. His examples of emergent systems are biology and game theory (from an economic perspective). I would add physics to the list: of his co-authored paper showing that the spectral gap is undecidable, David Pérez-García said, "For example, our results show that adding even a single particle to a lump of matter, however large, could in principle dramatically change its properties."
  2. Spolsky's famous characterisation of interfaces built on shaky foundations as leaky abstractions makes the distinction between compositional and emergent systems a little less than sharp, to my mind.
  3. We could talk endlessly about his list of compositionality-breaking PL features. The evils of global state are well documented, but I find dmbarbour's argument that Local state is poison a very interesting alternative way to look at which properties we want from code; more generally, what kind of compositionality a PL offers is very much paradigm-dependent. Gotos are considered harmful, but the Linux kernel has little trouble with longjmp because of its mandated coding style: compositionality in engineering is not just a matter of semantics but also of use. He targets OO and Haskell's type classes - I think he is quite correct - but note that within these paradigms one can regain compositionality by restricting to LSP or to algebraic classes, and, supporting his thesis, these ideas have spawned designs for new, cleaner PLs.


"Local state is poison" link

"Local state is poison" link is invalid.



Leaky abstractions, like the

Leaky abstractions, like the Chomsky hierarchy, came to computer science from linguistics. From Edward Sapir's book Language (1921, p. 39):

The fact of grammar, a universal trait of language, is simply a generalized expression of the feeling that analogous concepts and relations are most conveniently symbolized in analogous forms. Were a language ever completely "grammatical," it would be a perfect engine of conceptual expression. Unfortunately, or luckily, no language is tyrannically consistent. All grammars leak.

I wonder if that is the same

I wonder if that is the same kind of leakiness. The class of grammars is cleanly partitioned by the Chomsky hierarchy. Internally, any given grammar is a ball of spaghetti with no clear abstraction between its constituent rules, but do we really use grammars as composition blocks where the internal details leak across the abstraction boundary? The example that springs to mind of composition of grammars is the classic lexer/parser split, where the boundary does not leak if the language is context-free, although it does if the parser has context-sensitive behaviour (i.e. the lexer hack). I suspect this is really an issue because the regular grammar does not sit naturally inside the context-sensitive grammar, as we are missing a more natural way of expressing the context-sensitive grammar itself. I thought it was conventionally agreed that "grammars do not compose" - are there interesting cases where they do, and the leakiness across the interface between them becomes an issue?
[ edit: wow autocorrect does some horrible things ]
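The lexer hack mentioned above can be sketched in a few lines of JavaScript (a toy, not any real C front end; `lex` and the token kinds are made-up names): the lexer alone cannot decide whether `x` in `x * y;` is a type name or a variable, so the parser's symbol table has to leak back across the lexer/parser boundary.

```javascript
// Toy lexer: classification of an identifier depends on parser feedback
// (the `typedefs` set), which is exactly the leak across the boundary.
function lex(src, typedefs) {
  return src.split(/\s+/).filter(Boolean).map((word) =>
    typedefs.has(word)            ? { kind: "TYPE", text: word }
    : /^[A-Za-z_]\w*$/.test(word) ? { kind: "IDENT", text: word }
    :                               { kind: "PUNCT", text: word });
}

// With no feedback, "x * y" lexes as IDENT PUNCT IDENT (an expression);
// once the parser has recorded `typedef ... x`, the very same characters
// lex as TYPE PUNCT IDENT (a pointer declaration).
```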

Common cause

I submit both forms of leak are caused by the mismatch between sapient human minds and non-sapient formal structures. Mathematics is (as I've remarked in another thread) a conversation between human minds, and its abstractions are sapience-shaped; programming abstractions break down because they are, in essence, attempting to "explain" abstractions to the non-sapient computer. Natural language is —not going out on a limb, here— also a conversation between human minds; hence formal grammars break down because they are trying to contain sapient dialog within a rigid structure. Sapience doesn't fit into the confines of a formal system (I've blogged relating this to Gödel's Theorems), so both programming abstractions and formal grammars "leak".

[edit: fixed blog link]

Grammars do compose

Grammars compose if you do them the right way. It's like saying types don't compose.

The way to make grammars compose is the same as for types .. because grammars ARE types, just more powerful ones than the weak types in most PLs.

The technique should be well known to readers here.

Open Recursion.

Grammar composition principles

Recursion is not the best way to compose grammars IMHO.

There are two better partial examples of grammar composition:

  1. Free operator definition in Prolog (and some other languages)
  2. Macros in Lisp

I tried to extend the first approach to generic grammars a long time ago. Some design decisions there are obsolete (particularly with respect to keywords), and I've changed them in a more recent unreleased version, but now I'm focused on another project. I still believe that such an approach is the better approach to grammar extension and composability. At least it defines a grammar along a natural line of thinking: contexts, statements, operators, literals, while traditional grammar definition languages force language designers to compile these concepts down to grammar productions. Such translation should be done by the parser generator rather than by a human.

Prolog operators

"Such translation should be done by the parser generator rather than by a human."

The parser generator for a sufficiently advanced language is maybe 1% of the work of compilation; it really isn't worth asking a tool to take that 1% off of your plate, because there's a lot (of great value) that you can learn and discover by actually writing the parser by hand. (I have regretted the few times that I let a tool steal this fun and educational work away from me.)

For basic languages, or for disposable grammars, I would guess that a parser generator would have a far greater positive impact (percentage-wise). OTOH, the absolute costs are low enough in those instances as to be ignored.

I am very curious, though, what specific aspects you like about the free operator definition in Prolog. What makes it (to you) such a compelling way to compose a grammar?

Beauty of prolog operators

Prolog operators are beautiful from the modularity and locality points of view. There is still the problem of global state and the lack of visibility rules, but for an old interpreted language this is understandable.

Let's consider two modules. One introduces the operator `+` and the other `*`.

The developers do not have to think about each other in Prolog. Their operator definitions are independent. They only need `:- op(500, yfx, +)` and `:- op(300, yfx, *)` in the source code. And it is possible to use any graphic symbol.

The combined language is updated so that both operators can now be used. No prior agreement on operators is required, provided they do not conflict.

With usual grammar productions, if one adds an operator, the entire grammar has to be updated, because a production at a lower priority level must be aware of the next priority layer. The grammar is global and has to be updated each time we introduce something.

Another thing is that the developer's intention and the text in the code match each other. The developer wants to introduce an operator, not to update grammar productions. The operator syntax is practically the shortest way to get things done.
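The table-driven idea can be sketched in a few lines of JavaScript (a toy, not the poster's ETL; `ops` and `parseExpr` are made-up names). Priorities follow the Prolog convention used in the comment above: lower binds tighter, and yfx means left-associative.

```javascript
// Two independently contributed operators, as a data table rather than
// grammar productions. Adding an operator is one table entry.
const ops = { "+": 500, "*": 300 };

// Precedence climbing over the table; operands are single tokens here.
function parseExpr(tokens, maxPrec = 1200) {
  let left = tokens.shift();
  while (tokens.length && ops[tokens[0]] !== undefined && ops[tokens[0]] <= maxPrec) {
    const op = tokens.shift();
    const right = parseExpr(tokens, ops[op] - 1); // yfx: right argument binds strictly tighter
    left = [op, left, right];                     // build an S-expression-style tree
  }
  return left;
}
```

A module that wants a new operator just executes something like `ops["@"] = 400;` with no other production changes, which is the modularity the comment attributes to `op/3`.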

And you are considering closed programming languages. Think DSLs and programming language extensions. For DSLs or language extensions, we might need to write productions that are mutually recursive at the production level with the base language. But there is no such problem if we use statements and operators as the language definition elements. Only some limited extensions like this could be implemented without introducing new language elements. The closedness of programming languages is the current roadblock to language innovation. For example, Java users had to wait a long time before they received enums and the for(i : iterable) construct in Java 5. Such language extensions should be providable by a language plugin.

Might not be that simple?

I'll need to give this some thought. Unary and binary operators are quite simple (parsing, associativity, precedence) compared to the example you provided of "for(i : iterable)".

Think DSLs and programming language extensions.

Yes, I can clearly see the benefit in those cases. Entropy is a far greater devil in a language used more widely.

ETL is example

Yes. It is more complex, but the same composition principles can still be implemented.

See ETL examples and the rest of the documentation. Version 0.2.x had some design mistakes (in the area of handling keywords and numeric literals) that were fixed in the unreleased version 0.3.0 (available on GitHub, but lacking updated documentation for now). As a note, the grammar language is defined using the grammar language itself.

Nicely done

I can see that you put great care into that project. It will take me some time to digest that, though!

What I've found is that any complex constructs tend to give grammar-driven parsers a very hard time, and often it is those very same constructs that tend to be most "useful" to the person using the language. Just to pick an example at random (that happens to be fresh in my memory), anything that isn't LALR(1) and may need to backtrack in any significant manner:

    /**
     * Parse the multi-form of the VariableInitializer construct, or nothing.
     *
     * @return null or a MultipleLValueStatement
     */
    MultipleLValueStatement peekMultiVariableInitializer()
        {
        if (peek().getId() != Id.L_PAREN)
            {
            return null;
            }

        Mark mark = mark();

        List listLVals = new ArrayList<>();
        while (true)
            {
            boolean fFirst = listLVals.isEmpty();
            if (peek().getId() == Id.VAR || peek().getId() == Id.VAL)
                {
                // this has to be a multi VariableInitializer
                listLVals.add(new VariableDeclarationStatement(
                        new VariableTypeExpression(current()), expect(Id.IDENTIFIER), false));
                }
            else
                {
                // assuming that we haven't already started building a list of declarations,
                // encountering an expression followed by anything other than an identifier (for
                // a declaration) or a comma indicates that we're going down the wrong path
                Expression expr = parseExpression();

                // next token   meaning
                // ----------   ----------------------------------------
                // COMMA        the expression must be an LValue (commit)
                // IDENTIFIER   expression was the type portion of a declaration (commit)
                // R_PAREN      list is empty    : oops, it's not a multi VariableInitializer
                //              list is NOT empty: done parsing the multi VariableInitializer
                // otherwise    list is empty    : oops, it's not a multi VariableInitializer
                //              list is NOT empty: it's an error
                if (peek().getId() == Id.IDENTIFIER)
                    {
                    // there is a variable declaration to use as an L-Value
                    listLVals.add(new VariableDeclarationStatement(
                            expr.toTypeExpression(), expect(Id.IDENTIFIER), false));
                    }
                else if (fFirst && peek().getId() != Id.COMMA)
                    {
                    return null;
                    }
                else
                    {
                    // the expression has to be the L-Value
                    if (!expr.isLValueSyntax())
                        {
                        log(Severity.ERROR, NOT_ASSIGNABLE, expr.getStartPosition(), expr.getEndPosition());
                        }
                    }
                }

            if (!fFirst && match(Id.R_PAREN) != null)
                {
                return new MultipleLValueStatement(listLVals);
                }
            }
        }


ETL compiles grammar to LL(*)

ETL compiles the grammar to LL(*) for expressions and statement annotations, and to LL(1) in all other places. The documentation incorrectly says LL(1) because the parser actually looks ahead only one token during parsing, but it keeps a stack of states internally, and the output queue for the term parser uses marks and structural token insertions to emulate the different AST-building operations for the yf* cases.

The good thing is that this is done internally by the grammar interpreter, so the user of the language does not have to bother with it. Grammar compilation is done at runtime. It is theoretically possible to cache compilation results on disk using Java serialization, or to compile a grammar directly to Java code, but I currently do not see the need for it: compiling grammars to runtime structures is relatively fast, and in 0.3.0 it is even asynchronous.

And the grammar compiler just fails on conflicts.

There is also error recovery built into the parser that relies on the common statement/block language layer. Version 0.2.0 supports only simple "end-of-statement" recovery. Version 0.3.0 supports a more complex error-recovery strategy that recovers to the next keyword or block start.

P.S.: There is an error recovery sample in the tutorial.


On "local state is poison"

On "local state is poison". I completely disagree.

Local state is a way of thinking locally about things. It is a tool for complexity management.

The really poisonous thing is the unification of entities with different lifetimes. There is persistent state and ephemeral state in a program. The first needs live upgrade, and so on. The second can die at any time, and its task is to affect the persistent state.

There is really no way to consistently upgrade information in a live process. But let's look at the patterns used in real enterprise applications. We could argue whether they are best or not, but they follow the track of managing this problem. There are regular attempts to depart from that pattern (a recent one being OODB, which has died almost completely), but they still return to the traditional path (in the form of graph databases).

Firstly, there is a clear separation between persistent and ephemeral state. The ephemeral state is not reliable and is lost on every application restart. It is designed to die, is not explicitly managed, and is often implicitly typed. The persistent state is usually explicitly designed and typed.

Secondly, there are usually clear code boundaries between persistent and ephemeral changes. Changes are usually done in transactions, which reduces the risk of inconsistent changes to the persistent state.

Thirdly, the persistent state has a data definition language that assumes a live-patch model (ALTER TABLE, CREATE TABLE, etc.). The ephemeral data definition language is a complete type model.

Yes, these languages are completely different because of the different conditions. But we can see in both the same chain of abstraction levels, corresponding to the groups of cognitive operations defined by J. Piaget.

Cognitive Stage     | Organizational Principle                                                   | Programming Language                   | Database Language
Pre-operational     | Single-level patterns (sequences, labels, global scope)                    | Fortran 66, Assembler, BASIC (classic) | CODASYL, SQL
Concrete Operations | Hierarchies in code and data                                               | C, Pascal                              | Graph databases
Formal Operations   | First-class meta-constructs (functional programming, generics, contracts) | OOP (Java, C#), FP (Haskell)           | None

I guess the line will continue. There will be true OO databases in the future (OODB was a false name; there was almost no OO in it), with garbage collection, generics, encapsulation, and other things.

As you can see, each stage adds a new construct that organizes the constructs of the previous stage. For example, patterns (sequences of actions) organize names (actions), hierarchies (blocks) organize patterns (sequences), and meta-structures (classes) organize hierarchies (procedures and data structures).

According to M. L. Commons, the next abstraction stage is systems thinking. So the true next-generation programming language will introduce some concept of a system that organizes classes. And if we look at the current state of big systems, there are numerous efforts to define DSLs for organizing systems. In the Java world there is Spring, a language for defining a system of connected components; in other languages there are also numerous dependency-injection frameworks. There are patterns for cloud applications that provide emergent system properties, such as reliability on top of unreliable hardware (which is also an effect of systems thinking).

The first needs live upgrade,

The first needs live upgrade, and so on. The second can die at any time, and its task is to affect the persistent state.

The point that David was making in that article is that the ephemeral state often can't "die at any time". There can be any number of ongoing activities, and abandoning any of them to force an upgrade could leave the system in an inconsistent state, either internally or with external resources.

If the program is required to deal only with persistent state on the other hand, then that enforces an additional constraint that every state transition be well-defined, recoverable (and probably bounded).

Cost tradeoff

My point is that it should be designed to be able to die at any time, and the persistent state should be designed to be recoverable. If it cannot die at any time, that is a design bug. Hardware fails. And even worse, sometimes hardware fails suddenly.

Persistent state is expensive from a resource standpoint for a reason: the additional resources are spent on an additional property. And persistent state requires more work from the developer to plan ahead (per storage element, such as a field or a table). Upgradability is not free. It costs additional work from the programmer's point of view.

On the other hand, ephemeral state is cheap from a resource standpoint and requires less developer time. Upgrade is just a regular wipe of the data. We do not have to pay for upgradeability.

This is a trade-off. In an ideal world we would have it all: cheap, durable, and upgradeable (I guess we can select at most two of these properties). However, I do not see how we could remove this trade-off.

Making all state global would not change anything. If anything, we would go back to the days of Fortran 66, where we had to implement recursion using arrays that emulated the call stack. So we would only be able to allow programs of lesser behavioural complexity.

In the article "local state is poison" the author just suggests everyone pay the cost by moving everything to persistent state, whether one needs it or not.

In the article "local state

In the article "local state is poison" the author just suggests everyone pay the cost by moving everything to persistent state, whether one needs it or not.

No, this is a strawman. Your argument is effectively the same as the argument made in that article; you just have a different understanding of what constitutes "local state". Rather than debate the minutiae, here's a summary of the argument:

A program is a set of purely functional transitions on persistent state. Any transitions that access external resources must be lifted to persistent states. Thus, any computation can be moved, killed/resumed, or analyzed at any point. This is exactly what the article argues for, and you seem to largely agree.
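The model in that summary can be sketched in a few lines of JavaScript (a toy with made-up names, not the article's actual machinery): each transition is a pure function from state to state, the runtime persists the state between steps, and so the computation can be killed after any step and resumed from the last persisted state.

```javascript
// Each transition is a pure function: old state in, new state out.
const transitions = [
  (s) => ({ ...s, count: s.count + 1 }),
  (s) => ({ ...s, count: s.count * 2 }),
];

// Run from a persisted checkpoint { step, state }; in a real system the
// assignment to `persisted` after each step would be a durable write.
function run(persisted) {
  let { step, state } = persisted;
  while (step < transitions.length) {
    state = transitions[step](state);
    step += 1;
    persisted = { step, state };
  }
  return persisted;
}
```

Because every intermediate state is a plain persisted value, killing the process between steps loses nothing: calling `run` again on the last checkpoint produces the same final state.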

IO is the purpose of a program, and it is essentially ephemeral

A program is a set of purely functional transitions on persistent state. Any transitions that access external resources must be lifted to persistent states.

Disagree. This is a narrow niche of practical situations (batch jobs over a database), or it just moves the problem to intermediaries.

Thus, any computation can be moved, killed/resumed, or analyzed at any point.

Agree. But not "thus", as the previous statements describe a niche solution (for example, batch jobs). The part "or analyzed at any point" depends on interpretation, but in general it might not be possible.

This is indeed a difference in understanding of local state. I just do not believe in the "purely functional" limit in the real world. Real-world applications interact with the environment, and those environment interactions cannot be captured as purely functional in the normal situation. There is always room for failure (the CAP theorem, for example). The application still should be ready for it.

For example, calling a web service is still local state until we save the result to the database. The program should be designed to be ready to fail at the start of the call, during the call, and after the call. The call to the service should have a meaningful recovery policy. Such a policy is not necessarily automatic. There might be detect-and-update SQL scripts launched manually, or just a sum reserved for paying the lawyers who will discuss lost transactions in court (I've seen both such manual solutions in my practice).

There is no way to provide absolute reliability. Programs and hardware will always have a chance to fail, and it is not possible to reliably maintain invariants and avoid data loss in distributed systems (the CAP theorem). The question is how damage control is implemented. Lawyers can be cheaper than the maintenance of an extra persistence layer that could fail anyway.

The "Local state" article argues that ephemeral state is reducible to purely functional state. It is not. Part of it can be reduced, but IO cannot be completely abstracted. At best, it can be moved to intermediaries. And doing IO is the primary purpose of most practical applications (income). Everything else is a loss of resources (cost). The ideal program keeps the same IO behavior while doing zero-resource computation. Even a compiler's purpose is to read sources and write good object code correctly; everything else contributes to global warming, electricity bills, and amortized hardware cost.

"Local state" article argues

The "Local state" article argues that ephemeral state is reducible to purely functional state. It is not. Part of it can be reduced, but IO cannot be completely abstracted.

"Completely abstracted" is undefined. Monads completely capture side-effects in a purely functional fashion, so in principle, your claim appears false.

I can't conceive of a single program that can't be represented with purely functional transitions between stable persistent states, so if you have a real example, please share.

IO monads define a DSL isomorphic to imperative programming

I have limited exposure to Haskell, as I did not like its closed-world principle and lack of open types. However, I think I have figured out IO monads.

If we skip the long reasoning about the pure functional theory under them, IO monads are a way to implement an imperative DSL inside functional code.

main = do  
    putStrLn "Hello, what's your name?"  
    name <- getLine  
    putStrLn ("Hey " ++ name ++ ", you rock!")  

There would have been no difference if we had just embedded a functional sublanguage into an imperative language, as is done in Scala or Kotlin. IO monads are just a more complex way to do the same thing using purist means. We translate imperative code into functional abstractions and then translate it back into imperative control flow (assembler and system calls). We could just as well skip that "purely functional" step.

Thus IO monads are just a failed attempt to escape imperative code when working with IO. There is no escape from the submarine; promises provide a far better means of organizing IO code than IO monads, using hybrid imperative-functional means, as there is no loss from concept translation.


Promises are a monad, with unit = return and bind = then. They are used to implement an asynchronous imperative layer over an imperative layer.
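The correspondence claimed above can be spelled out directly (`unit` and `bind` here are illustrative names, not part of the Promise API):

```javascript
// unit :: a -> Promise a
const unit = (x) => Promise.resolve(x);

// bind :: Promise a -> (a -> Promise b) -> Promise b
const bind = (p, f) => p.then(f);

// An "asynchronous imperative layer": each step is sequenced by bind.
bind(unit(1), (x) => unit(x + 1)).then((r) => console.log(r)); // logs 2
```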

Promise is approximately the Async monad

A Promise is approximately equal to the Async monad in Haskell in functionality. But they are not necessarily implemented like that.

Most promises (that I have seen and created) have two fundamental interfaces:

  • Add listener.
  • Resolve promise with success or failure.

So it is more like a single-element memorizing event channel.

The flatMap operation on a promise is usually a utility method that works over these two fundamental operations. Saying that this is a monad is somewhat misleading; it is equally possible to say that it is an object.

Async is a somewhat non-typical monad in Haskell. The IO, List, and STM monads provide a total order of operations, while Async provides a partial order. It is the only partial-order monad in Haskell that I know of, and I'm not sure it does not invalidate some assumptions about monads. Total order is an important feature of the other monads.

Implementation not important

How monads are implemented is not important; they are an interface. In JavaScript:

(new Promise((unit, err) => {
    setTimeout(() => { unit(1) }, 1000)
})).then((x) => {
    return new Promise((unit, err) => {
        setTimeout(() => { unit(x + 1) }, 1000)
    })
}).then((x) => {
    console.log(x) // 2, after two seconds
})

Consider the type of 'then': it accepts a function from a value and returns a promise. In TypeScript:

then(f: (x: any) => Promise<any>): Promise<any>

This would be a method on the promise object itself. Rewriting it as a Haskell type would look like this:

then :: Promise a -> (a -> Promise b) -> Promise b

Which clearly fits the type of bind :: m a -> (a -> m b) -> m b, with m = Promise.
Then unit is Promise.resolve, which has type 'a -> Promise a'; this fits the type for unit, 'a -> m a', with m = Promise as well.

Should be pretty easy to show it follows the monad laws too.

Except that JavaScript Promise is not a monad

See "No, Promise is not a monad" for an analysis.

Sort of

The argument appears to be that left-identity and associativity do not hold for every possible function. Therefore it does hold for some variations. In the end JavaScript is not strictly typed, so it cannot exclude these other variants. However the above criticism is probably valid for _all_ attempts to define a monad in JavaScript. In Typescript we could define a type signature for a monad, and a Promise would comply, it would just exclude the non-compliant use cases.

I guess this comes down to the distinction between some promises are monads, and all promises are monads.

Do you lose any useful functionality by excluding the non-monadic promise variants?
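The left-identity failure discussed above can be seen directly: `.then` auto-flattens nested promises, so `unit(a).then(f)` and `f(a)` diverge when `a` is itself a promise (taking `unit = Promise.resolve`):

```javascript
const inner = Promise.resolve(1);
const f = (x) => Promise.resolve(typeof x);

// f applied directly sees the promise object...
f(inner).then((t) => console.log(t));                       // "object"
// ...but routed through unit/then, f sees the unwrapped value:
Promise.resolve(inner).then(f).then((t) => console.log(t)); // "number"
```

So left identity fails precisely for promise-valued (or thenable) arguments; for ordinary values the two sides agree, which matches the "some promises are monads" reading above.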

DSL argument

The operations you have mentioned are read operations on the promise, but a promise (JavaScript or otherwise) has write operations as well (in JavaScript, these two operations are passed to the function with which the promise is created). And these write operations are not pure; they are non-deterministic and order-dependent. They cause non-local changes to the computation process.

This is at least an extension to the typical Monad protocol, which is focused on reading. And this write protocol is an essential feature of a promise; discussing the Promise without acknowledging it is the same as just saying that a promise is an object. Yes, it is an object as well, but it is more than that. We can use promises well and express data dependencies with them exactly because promises are mutable objects.

But I still see no counterargument to the part of my reply that started this discussion about promises.

The thing is that, whether with promises or not, a Haskell program starts with an imperative DSL: main in the IO monad, which is dropped somewhere to do purely functional computations. And it is only possible to work with persistent storage using this imperative DSL.

This imperative DSL has exactly the same problems as other imperative languages (order dependency, the ability to bring storage into an inconsistent state, and so on). Programming with this imperative DSL does not give any advantages over using a normal imperative language (from the reliability and interruptibility points of view), but it has a lot of disadvantages, as it is too cumbersome to use IMHO when one needs to get things done.

In Kotlin or Scala we likewise start with an imperative language, but we can use functional features as needed. This is the same as with Haskell, but without the awkwardness of telling oneself that we are writing purely functional code purely for the "side effects" it produces. We are just honestly doing the imperative part of the program, i.e. the primary and intended effects, rather than shyly calling them "side effects".

These are "side effects" only if we consider object methods to be "functions", because they somewhat look like functions. They are not functions. They are processes with parameters. Pure functions are just well-isolated processes that have no interaction with the world other than accepting parameters and producing an outcome (and that is ignoring the memory and CPU load that could affect the timing and success of other computations). This is an important case, but it is just a case that enables some interesting optimizations. The truth is that the algebra of processes is different from the algebra of functions, despite some similarities, specifically because processes interact with the environment.

Promises, Promises

See Andreas' comment below: the JS promise could have been defined to be a monad (if it did not recursively unwrap, I think), so "Promises" in general definitely can be a monad; it's just that the JS one misses by a small margin. If only the committee could have accepted a compliant definition, it would make explaining this a lot easier :-)

Monadic behavior is not a complete promise API

My argument is orthogonal to JS peculiarities. My argument is that a promise is a mutable object (a stateful channel) with both read and write operations. Monad-like behavior is just one of its facets (i.e. the read API), and it is incomplete at that.

Every promise I know of has methods that change its state to some value or error if it was unresolved before. This is unavoidable, because of interaction with external systems. Something like:

setResult(p : Promise a, value : a) : Unit
setFailure(p : Promise a, failure : Error) : Unit

These methods return nothing, are order-sensitive, and are called only for their side effects.

It could be done in different ways.

  • CompletableFuture in Java directly invokes methods
  • JS passes these two functions to the function body supplied as a constructor parameter when the promise is created
  • E programming language and AsyncFlows use resolver interface

But this is radically different from the IO monad, where there is no state change, only read operations. A promise behaves like a monad only after it has been completed with success or failure. Until that time a promise looks like a completely different animal. For example:

function aNever() { return new Promise(function (a,b) {})} 

Returns a promise that never completes.

Since promises are non-deterministic and order-dependent, I doubt that any process that uses them could be called functional. This differs from the IO monad, which has a value or failure whenever it is available. Describing a promise without these resolver operations is a bit strange, since these operations cause non-deterministic, non-purely-functional behavior and are an essential part of the promise API. The other essential set of operations is listening for state changes, as that is essential for writing coordination code such as loops. So the minimal complete promise API is the following:

createPromise() : Promise a
setResult(p : Promise a, value : a) : Unit
setFailure(p : Promise a, failure : Error) : Unit
onFailure(p : Promise a, handler : Error -> Unit) : Unit
onSuccess(p : Promise a, handler : a -> Unit) : Unit

There is often a separation between the listening and resolution aspects (E, JS, AsyncFlows), but sometimes they are lumped together, as in Java's CompletableFuture.

Everything else is just usability extras. Creating an already-failed or already-resolved promise is also an optimization over this set of operations.
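For illustration, the five operations above can be sketched in plain JS. This is a toy mutable cell with listeners, not a spec-compliant promise (no chaining, no unwrapping); the names follow the signatures above:

```javascript
// Minimal mutable-promise sketch matching the five operations above.
// Toy illustration only: supports listeners, but no chaining or unwrapping.
function createPromise() {
  return { state: "pending", value: null, successHandlers: [], failureHandlers: [] };
}

function setResult(p, value) {
  if (p.state !== "pending") return;   // order sensitive: the first write wins
  p.state = "success";
  p.value = value;
  p.successHandlers.forEach((h) => h(value));
}

function setFailure(p, failure) {
  if (p.state !== "pending") return;
  p.state = "failure";
  p.value = failure;
  p.failureHandlers.forEach((h) => h(failure));
}

function onSuccess(p, handler) {
  if (p.state === "success") handler(p.value);
  else if (p.state === "pending") p.successHandlers.push(handler);
}

function onFailure(p, handler) {
  if (p.state === "failure") handler(p.value);
  else if (p.state === "pending") p.failureHandlers.push(handler);
}
```

Note how `setResult` and `setFailure` are pure side effects: exactly the write API that the monadic reading of promises leaves out.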

On the other hand, operations like map, flatMap, and then are accidental parts of the promise API that provide a utility read API for the promise using listeners. These could have been provided outside the promise by separate operators. For example, ANY or ALL operators look better when defined outside the promise as operators over closures that return promises (like this) rather than as promise methods.
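As a sketch of that point, an ALL operator can live entirely outside the promise, as an operator over closures that return promises (illustrative code, not any particular library's API):

```javascript
// ALL as an external operator over closures that return promises,
// rather than a method on the promise itself (sketch).
function all(thunks) {
  return new Promise((resolve, reject) => {
    const results = new Array(thunks.length);
    let remaining = thunks.length;
    if (remaining === 0) return resolve(results);
    thunks.forEach((thunk, i) => {
      thunk().then((v) => {
        results[i] = v;
        if (--remaining === 0) resolve(results);
      }, reject);
    });
  });
}

all([() => Promise.resolve(1), () => Promise.resolve(2)])
  .then((vs) => console.log(vs)); // [1, 2]
```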

Saying that a Promise is just a monad does not do it justice. The monad-like behavior is just one possible read API for the promise, so it is only one facet of the promise, and an incomplete one at that, since it does not include the listen operation, which does not return a promise.

When we use promises and connect them with `pure functions`, we get small deterministic islands in a sea of generally non-deterministic code. Any promise represents an unknown result; otherwise we would not have used it. Saying that such code is purely functional, restartable, or otherwise `pure` is just masking the truth by saying that the sea between the islands is not important.

Not Just a Monad

First, a Future is not a Promise; it is something different.

A monad is an interface, so a Promise can be a Monad, and be other things at the same time. To put it in very object-oriented terminology, a Promise implements the Monad interface.

So when people say "A Promise is a Monad" it is like saying "A Dog is an Animal"; however, dogs have special features that generic 'animals' do not have.

Promises are not in general cancellable; JavaScript promises are not cancellable, and because a promise starts immediately you cannot really 'set the result' of a promise. You might be able to do it, but it's really undefined behaviour, because the async process will carry on and call resolve eventually itself.

So in JavaScript calling either "resolve" or "reject" from outside a running promise is a bad practice, and not how promises were intended to be used. If you want something cancellable you want to use a Task instead.

In general a loop over a promise is a fold, and can be expressed like this in JS (where `f` is the per-element asynchronous operation):

return list.reduce((promise, x) => {
   return promise.then(() => f(x))
}, Promise.resolve())

Terminology woes

Note that different schools of terminology exist regarding the meaning of the future/promise distinction, and they use these terms in incompatible ways. One puts the distinction on whether the type exposes a resolution operation; another on whether you can block on it directly or need to chain a continuation. And then there are others that don't follow either of these distinctions and use the terms more or less interchangeably. Ironically, that includes the original papers from the 70s.

The argument appears to be

The argument appears to be that left identity and associativity do not hold for every possible function, though they do hold for some variants. In the end JavaScript is not strictly typed, so it cannot exclude the non-compliant variants. However the above criticism is probably valid for _all_ attempts to define a monad in JavaScript. In TypeScript we could define a type signature for a monad, and a Promise would comply; it would just exclude the non-compliant use cases.

That is incorrect on both accounts. Monad laws and typing are orthogonal.

If JS promises had been designed properly and did not perform "recursive unwrapping" then they would indeed form a monad (and as a consequence, async would be a monad). This was very well known to the JS committee at the time, as this point was discussed to death (I was in those discussions). But in the end the faction won who argued for "convenience", whatever that means.

Inversely, types can't rescue it, because you cannot actually type the `then` method once you have given it this bogus semantics. The monadic type you'd want -- and the one that TypeScript uses(*) -- is simply unsound. It would take some form of type complement (with a quantifier such as in "Promise<T> for any T that is NOT a subtype of PromiseLike<U> for another U..."). But expressing such negative information is impractical in most type systems, because it interacts badly with type abstraction.

(*) TypeScript makes some weak attempt to cover more cases by typing it

Promise<T>.then<U, V>(
  onfulfilled? : (value : T) => U | PromiseLike<U>,
  onrejected? : (reason : any) => V | PromiseLike<V>
) : Promise<U | V>

but this breaks down if U or V are e.g. instantiated with some Promise<Promise<W>>, in which case the actual result value is a Promise<W> not a Promise<Promise<W>> -- provided W itself isn't again a promise type, and so on ad infinitum.
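The recursive unwrapping is easy to observe directly:

```javascript
// The handler never sees the inner promise: it has already been unwrapped.
const nested = Promise.resolve(Promise.resolve(2));
nested.then((v) => {
  console.log(v instanceof Promise); // false
  console.log(v);                    // 2, not a Promise of 2
});
```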


Seems a shame TypeScript's type system is not able to constrain this adequately; I see it cannot fix it entirely, but it could disallow nested promises. It also seems a pity that JavaScript introduces something that cannot be soundly typed, but then again it is a dynamic language.

I can use a promise like a monad though, so my comment that some promises are monadic still holds, and I could go a long time without ever realising that forms of promises exist that break the monad laws, as they seem to be completely unnecessary?

Edit: to clarify I mean some JS promises comply with the monad laws (the ones that do not use recursive unwrapping). As long as I do not use recursive unwrapping the promises are a monad.

The issue isn't that some

The issue isn't that some promises are monads but the type system isn't strong enough to separate them from the others. It's that no promises are monads [edit: actually, I don't even know what that means; it's the promise type function that could be, but isn't, a monad], because the definition of monad has a quantifier in it, and there is a counterexample for promises: other promises. If you had a language that helpfully flattened [1, [2, 3], 4] to [1, 2, 3, 4] automatically for you, then lists wouldn't be a monad either.
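The list analogy can be made concrete: monadic join removes exactly one layer of nesting, which is what Array.prototype.flat with depth 1 does. A language that flattened recursively would behave like flat(Infinity) and break the laws the same way JS promises do:

```javascript
// join for lists removes exactly one layer of structure:
console.log([[1], [2, 3], [4]].flat(1)); // [1, 2, 3, 4]
console.log([[[1]]].flat(1));            // [[1]] - the inner layer is preserved
// A hypothetical auto-flattening language would act like this instead:
console.log([[[1]]].flat(Infinity));     // [1] - inner structure destroyed
```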


Doesn't the IO monad do exactly that?

f _ = putStr "2 " >>= (\_ ->
    putStr "3 ")

main = putStr "1 " >>= (\_ ->
    f ()
    ) >>= (\_ ->
    putStr "4 ")
Will print "1 2 3 4 "?

function print(x) {
   return new Promise((resolve, reject) => {
      process.stdout.write(x)  // write without a newline, assuming Node
      resolve()
   })
}

function f() {
   return print("2 ").then(() => {
      return print("3 ")
   })
}

function main() {
   return print("1 ").then(() => {
      return f()
   }).then(() => {
      return print("4 ")
   })
}

main()
will print "1 2 3 4 " also.

So doesn't the call semantics of Haskell 'auto-unwrap' as well?

Monad Laws for JS Promise:

Left Identity:
    return Promise.resolve(x).then((y) => {return f(y)})
=== return f(x)

Right Identity:
    return m().then((x) => {return Promise.resolve(x)})
=== return m()

Associativity:
    return m().then( (x) => {return f(x)} ).then( (y) => {return g(y)} )
=== return m().then(  (x) => {return f(x).then((y) => {return g(y)})}  )
=== return (  m().then((x) => {return f(x)})  ).then((y) => {return g(y)})

Wrong examples

Your examples don't exhibit the problematic case. That occurs when the content of the promise is itself a promise.

Specifically, substitute `Promise.resolve(z)` for `x` in your Left Identity law. Then JS violates the law, because what the expression actually computes to is `f(z)` not `f(Promise.resolve(z))` as the law would require. Right identity breaks in a similar manner, IIRC.
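This substitution is easy to run; in this sketch `f` just reports what kind of argument it was given:

```javascript
// Left identity requires Promise.resolve(x).then(f) to behave like f(x).
const z = 42;
const x = Promise.resolve(z);
const f = (v) =>
  Promise.resolve(v instanceof Promise ? "saw a promise" : "saw " + v);

Promise.resolve(x).then(f).then(console.log); // "saw 42": f was given z, not x
f(x).then(console.log);                       // "saw a promise": what the law requires
```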

Another angle on this is that the JS promise semantics deeply violates parametricity.

Wrong Example?

That's the wrong example; you should put :-)

Promise.resolve(() => {return Promise.resolve(2)})

Anyway, I do see that you can break the laws when you pass a promise, because of the unwrapping. However, as long as 'm', 'f' and 'g' in the above examples do not accept a promise as their argument, the resulting code does obey the monad laws. I don't think I have ever written a promise like this, and a simple rule would be to avoid having promises as arguments to your building-block promises. A type system with disequality could write:

type Promise<A> requires forall B . A != Promise<B>

Edit: TypeScript 2.8 can actually define the type we want:

type NotPromise<A> = A extends Promise<any> ? never : A

function unit<A>(x : NotPromise<A>) : Promise<A> {
    return Promise.resolve(x)
}

let a : Promise<number> = unit(2)          // okay
let b : Promise<Promise<number>> = unit(a) // error

This will not allow the definition of 'b'. Unfortunately TypeScript infers the type 'Promise<{}>' for all promises, so you have to give explicit type annotations for it to work.

Not sure

Not sure I understand your example, but note that both .then and Promise.resolve unwrap a promise argument recursively, which is what my version was exploiting.

Yes, I mentioned above that some form of type complement could express this, but type systems don't tend to have that (for good reasons). More to the point, a type constructor with a constraint on its domain wouldn't qualify as a monad AFAICT.

Edit in reply to edit: Wow, I wasn't aware that TS has added type-level conditionals now. From what I've seen about the theory of such features, I'm not sure how wise a move that is.


Well that wasn't meant to be a serious example, but it can be completed like this:

function unit(x) {
    return () => Promise.resolve(x)
}

function bind(x, y) {
    return () => x().then(y).then((z) => z())
}

function run(x) {
    return x()
}

run(bind(unit(1), (x) =>
    bind(unit(2), (y) => unit(x + y))
))

run(bind(unit(unit(3)), (x) => {
    // x is still the wrapped inner value, not an unwrapped 3
    run(bind(x, (y) => {
        return unit(y)
    }))
    return unit(x)
}))

Basically using a function to prevent recursive unwrapping.

In summary const was right when they said that a (JavaScript) promise is approximately a monad, but for the wrong reasons. The additional API functions do not stop a promise from being a monad, and in general there are promises that are monads, just not the JS ones. If only the committee had listened to Andreas.

TypeScript Soundness

TypeScript deliberately does not try to be sound. That's why it does not do HM type inference, and instead only infers the 'outer' types, like 'Promise' for all promises.

If Gödel tells us a system cannot be both complete and consistent, TypeScript has deliberately gone for complete and sacrificed consistency; I would not be surprised to find a contradiction in there somewhere.

ephemeral global state

Ephemeral state is indeed useful! Especially for transitory features of low information-relevance, like tweening animations.

But ephemeral vs persistent is on a distinct categorical axis from local vs global. We aren't limited to "global persistent" vs "local ephemeral". For example, we can model "global ephemeral" states like a publish-subscribe bus where old published records expire, and fresh records must be continuously published. Or we can mount shared-memory volumes as a temporary, ephemeral filesystem. Conversely, we can roughly model "local persistent" states using techniques like Smalltalk images.
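A minimal sketch of such a "global ephemeral" publish-subscribe resource, assuming simple TTL-based expiry (all names here are illustrative):

```javascript
// Sketch of a "global ephemeral" state resource: a publish-subscribe bus
// where records expire unless continuously republished. Names illustrative.
function createBus(ttlMs) {
  const records = new Map(); // topic -> { value, expires }
  return {
    publish(topic, value) {
      records.set(topic, { value, expires: Date.now() + ttlMs });
    },
    read(topic) {
      const r = records.get(topic);
      if (!r || r.expires < Date.now()) { records.delete(topic); return undefined; }
      return r.value;
    },
  };
}

const bus = createBus(100);
bus.publish("position", { x: 1, y: 2 });
console.log(bus.read("position")); // fresh record is visible
// after ~100ms without republishing, read("position") returns undefined
```

The resource is global (any subprogram holding the bus can read or publish) yet ephemeral (nothing survives unless continuously refreshed), illustrating that the two axes are independent.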

As far as "restarts" go: even with persistent state, it can be useful to model resets for volumes of state - e.g. clearing a volume of state to default values, capturing or restoring a checkpoint. I explicitly mention this in the article (look for term 'reset'). A restart can easily and explicitly be modeled as a reset caused by an external condition. In general, I would posit that restarts/resets are something we should explicitly model and support within our software system definitions or paradigms, not leave entangled with an ad-hoc OS process model. Ad-hoc restarts don't work nicely for large scale, long-running systems such as distributed apps or overlay networks, and become difficult to reason about.

In my experience, it's convenient if "ephemeral" is a semi-transparent property for a subset of stateful resources. By semi-transparent, I mean: a program may assign an ephemeral partition of state resources to a subprogram, but the subprogram cannot directly check whether given state resources are ephemeral. This makes it easy to support fine-grained orthogonal persistence, and fine-grained tradeoffs between durability and performance. Similarly, when I implement a transactional update model, I allow individual transactions to be marked 'durable', which determines whether we must wait for sync upon commit. (Non-durable transactions that only touch ephemeral state resources are effectively a software-transactional memory, which can be useful.)

In the end, I feel your "complete disagreement" builds upon a simple category mistake, whereby you wrongly conflate (or causally correlate) 'global' with 'persistent'.


I am interested in your approach to state and I think this is relevant to compositionality. I cannot get past the performance aspect of global state. Surely on multi-core processors (Threadripper has 64 threads on 32 cores; next year AMD will have 64 cores) global state becomes an unacceptable performance limitation?

I can understand a small website keeping all its state in a database and having a stateless webserver/service, but as you scale up again you want multi-master databases with eventual consistency.

It seems to me that local state (perhaps something like the actor model, or a Kahn process network with bounded queue size, where each node has state) is what you want for parallel scalability?

re: (parallel) performance

Local state does simplify local reasoning about parallelism. But whether this improves actual parallelism is a separate question.

In practice, we try to partition global state resources and partition our computations in ways that roughly align. For example, with tree-structured state, we can assign different subtrees to different subprograms or services. For a publish-subscribe state resource, we might partition it into channels and topics that are each processed by only a few agents or subprograms. For an entity-component system, we might partition state into 'component' tables, and arrange for most update tasks to only touch a few tables. At very large scales, these partitions might be physically distributed, migrating data or computations as needed.
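A toy sketch of this partition alignment, with a flat map of path prefixes standing in for the state tree (all names illustrative):

```javascript
// Sketch: partitioning a global tree of state so subprograms touch
// disjoint subtrees and could therefore run in parallel without conflict.
const state = {
  "/physics": { bodies: [] },
  "/render":  { frames: [] },
  "/audio":   { buffers: [] },
};

// Each subprogram is granted one prefix; aligning these partitions with
// the task decomposition is what exposes the parallelism.
function subprogram(prefix, update) {
  return () => { state[prefix] = update(state[prefix]); };
}

const tasks = [
  subprogram("/physics", (s) => ({ bodies: [...s.bodies, "ball"] })),
  subprogram("/render",  (s) => ({ frames: [...s.frames, 1] })),
];
tasks.forEach((t) => t()); // disjoint prefixes: safe to schedule concurrently
```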

Insofar as this alignment of partitions succeeds, we will discover opportunities for parallelism. We cannot reason about this locally because it depends on non-local configuration decisions, but we can still achieve a lot of parallelism in practice.

Also, there are many opportunities for parallelism in stateless computations, e.g. collections processing. In my experience, even with Actors model systems, this is often the greater source of parallelism.

I don't believe that local state is offering sufficient advantages for parallelism, especially not when we consider that runtime extension and upgrade also become significantly more important at larger scales.


Consider designing something highly parallel like a modern pipelined CPU. We need state after each pipeline stage to locally store the computed result before the next pipeline stage. It would take a thousand times longer to store this back to main memory between stages, ruining the performance. Actors can model this kind of system precisely because they contain state, and an actor can model the pipeline register as well as the combinational logic that computes the next pipeline stage. This kind of local state is not even accessible from outside the device, and nor would you want it to be.
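A rough software model of such a pipeline, where each stage's register is the only state and is never visible outside the stage (illustrative, not a hardware-accurate simulation):

```javascript
// Sketch: a two-stage pipeline where each stage owns a local register.
// On each tick, every stage computes from its predecessor's register and
// latches the result; latching in reverse order preserves the old values.
function makeStage(compute) {
  return { register: undefined, compute };
}

function tick(input, stages) {
  for (let i = stages.length - 1; i > 0; i--) {
    stages[i].register = stages[i].compute(stages[i - 1].register);
  }
  stages[0].register = stages[0].compute(input);
}

const pipeline = [makeStage((x) => x + 1), makeStage((x) => x * 2)];
tick(3, pipeline); // stage 0 latches 4; stage 1 still has a bubble
tick(5, pipeline); // stage 0 latches 6; stage 1 latches (3+1)*2 = 8
console.log(pipeline[1].register); // 8
```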

If we consider compiling code to an FPGA then we would want similar local state to pipeline computations.

Similarly in a shared nothing distributed computer (like most modern super-computers) each node computes locally and data is streamed between nodes.

I don't believe that local state is offering sufficient advantages for parallelism

These are all examples where local state enables significant parallelism. I don't think a CPU could realistically operate without it, and I do not think a large super-computer cluster like those used in weather forecasting or atomic simulation could operate efficiently without it.

I agree with the original article that composition is an important property for software and hardware. I can see how that works with pipelined computation.

I find your logic on global state compelling. I would like it to work, but I can't seem to make it fit the use-cases I am looking at.

re CPUs

A system with a large number of CPUs can usefully be treated as a simple distributed system. Latencies are non-uniform due to cache, NUMA, network, etc. Attention to this "physical locality" is relevant for performance, but it is not the "local state" opposed in the article.

For global state resources, I usually model distributed systems by introducing logical latencies (and sometimes disruption) between state partitions. I've also experimented with modeling GPGPU partitions that constrain data types and computations. Modal types (such as Int@Here vs Int@There) also seem potentially useful to work with explicitly distributed and/or staged computations, but I haven't figured out how to best use them.

Global state does not imply random access.


So, could global state have a non-uniform distribution? For example consider a 1000 node cluster each with "local" storage. Could we model this as a tree / file-system with a top level directory that contains 1000 sub directories, each being the local state of each machine?

Would the logical conclusion to this be effectively two declarative languages? One is the data/state, and the other the pure/combinatorial functions and assignments?

Would you have something like this for initial state:

/a = 1
/b = 2

And this sort of function:

/c := /a + /b

It does remind me of Prolog and Datalog, which are languages I like. Prolog treats everything as one global state space.

re StateSpace

could global state have a non-uniform distribution? For example consider a 1000 node cluster each with "local" storage. Could we model this as a tree / file-system with a top level directory that contains 1000 sub directories, each being the local state of each machine?

Yes! If we intend to be more precise, we could instead model a logical network, defining available edges in a graph. But a tree is a very simple and convenient approximation.

The state is "global" in the sense that the tree or graph of state resources is external to the computation, yet "local" in a physical/performance sense to computations performed at a particular node. My article regards only the former.

The ambiguity of natural language can too easily lead to confusion. In hindsight, I might have been wiser to use different adjectives. Perhaps "external vs internal" state, distinguished from "centralized vs distributed" state and "ephemeral vs persistent" state as three orthogonal axes. Then my article would favor external state and be neutral regarding the other two properties.

*Aside:* physically local vs remote is a relative property in context of a distributed resource model (e.g. this data is near that processor). In comparison, distributed vs centralized are better adjectives to describe a system as a whole. We can't reasonably discuss physically local state without first having a model of physical distribution.

Would the logical conclusion to this be effectively two declarative languages?

I do not believe that externalizing state strongly implies a declarative computation model. For example, entity-component-system effectively externalizes state (into component tables) from computations (the system updates). But ECS commonly uses an imperative system update model.

If we do go declarative, it would likely be a more sophisticated model than you describe above. For example, it's very convenient if we can model many-to-one "inputs" whereby separate expressions contribute to a shared value or decision (i.e. concurrency). We could do this by having expressions that "add" to a set or other commutative monoid, for example. For modeling stateful applications, it's very convenient if we can either refer to 'prior' states or contribute to 'future' states, using some variant on temporal logic.

In the vein of Datalog, you might obtain some inspiration from Dedalus (Datalog + Time) and Bloom (Dedalus + Partitioned Space and more). But I'm not prescribing any specific model. Even favoring entity-component-system is a pretty good step in the right direction, IMO.