2 Misconcepts About Functional Programming (relating to context and monad)

Candle's last release announcement to LtU triggered some interesting discussion about whether the functional features in Candle are pure. I finally had sometime to write more about my thoughts on this issue. You can find the full content in this blog article.

In the blog, I talked about 2 misconcepts about functional programming. The 1st misconcept is that dynamic scope variables are not functional. The 2nd misconcept is that monads are functional.

In the blog, I coined a new concept called contextual functional, which extends the more strict pure functional concept.

In the following blog article, I also talked about the procedural design in Candle, including concepts like separation-of-side-effect and a 3 layer-architecture:

  • A functional core
  • A concurrent execution model
  • Shared and mutable state management

Your feedback is welcome!

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.


Interesting blog post. I disagree with you on both points.

In terms of monads being function I think Simon Peyton Jones is absolutely correct.

1) Functional programming happens in the world of pure timeless abstract mathematics where platonic ideal data is fed into stateless functions to produce results that are timeless.

2) The purpose of a computer program is to produce a side effect often dependent on ambient state.

Monads are the means by which the paradigm of (1) expresses itself in the world of (2). Monads are the medium of expression. Without the medium the mathematical expression is empirically unknowable. The expression must take on the the properties of the medium in which the ideal needs to be expressed.

So of course monads aren't purely functional, they are where purely functional code finds expression in the actual world. But that definition doesn't make them not part of a purely functional language. What makes a language purely functional is not that it can't carry out side effects but rather it makes side effects painful so the vast bulk of the code is side effect free.

As for the first point. Of course the universe is not purely functional. Once data is passed out of the purely functional part and into the universe it becomes tainted. Purely functional program reasoning treats all code that exists in the actual physical universe as tainted. So passing data around via. a container makes it immediately obvious that the data is not being preserved between function calls. That's all purely functional programming can do, make it explicit what can't be controlled.

Finally on the example of large quantities of data. There are no limits on the size of the state being weaved through the code via. the State Monad. If you mean a large block of unchanging data, an object with a bunch of methods works fine and just pass the object, and pick off specific details via. the method calls. Haskell allows you to define arbitrarily complex constants so you are only passing one variable and that only to top level functions.

Unless I missed something,

Unless I missed something, neither of your points is addressed in the blog post. I'll address them:

Dynamic scope is functional

What's your definition of "functional"? Mine (and I suspect most people's) includes referential transparency, and dynamic scope precludes referential transparency. Unless your definition of "functional" or "dynamic scope" differs from the common one, this statement is false.

On the topid of definitions, in the previous discussion you mention something about a Candle program's main function being referentially transparent -- this is true of imperative languages so long as they don't to any I/O, so this isn't really saying much. Referential transparency is only useful if it's global.

Monads are not functional

I'm not sure what your definition of a monad is. Mine is a data type M which has defined two functions, return: a -> M a, and bind: M a -> (a -> M a) -> M a (or alternatively, return: a -> M a, join: M (M a) -> M a, and fmap: (a -> b) -> (M a -> M b)). These are just restrictions on the kinds of data types called monads. In an otherwise functional language, you cannot produce non-functional semantics by restricting to the language.

That the IO monad in Haskell performs IO is irrelevant. I can implement a replacement IO module using pure constructs to say, model a world where the standard input stream consists only the text of Fresh Prince. Every program will behave the same as if I ran the program with the actual IO monad and echoed that same text to Fresh Prince. The only difference is that use of the actual IO monad means a program might produce a different result on each run. This is expected and desired behavior in programs performing I/O.

What's way more important to the notion of "functional" is that code is referentially transparent within a given run, not when run with a different world state.

I/O Monad is not Referentially Transparent

Let me poke the hole even bigger. As I think deeper about I/O monad, I can claim that it is not RT. Here's my prove:

  • As I mentioned in my blog, and I think everyone would also agree, for a function to be RT, it's output must not overlap its input. In other words, the output must not change the input during the execution of the function.
  • Functions using I/O monad in Haskell are declared as:
    main :: RealWorld1 -> ((), RealWorld2)
    ... etc.

  • The question now is: Does the output RealWorld2 overlap the input RealWorld1 in monadic I/O functions?

    I think most functional programmer won't able to answer that, because this RealWorld is just something magic, opaque.

  • My own thinking is that output RealWorld2 definitely overlaps, i.e. changes, the input RealWorld1 in some way, thus I declare i/O monad non-RT.

If we bend the rule to allow a pure (i.e. RT) function to mutate its input in someway, then all procedural functions can be called pure function.

Let me know if you think my proving is wrong in any way.

You can't just make stuff

You can't just make stuff up. Referential transparency has a well-defined and accepted definition: a function is referentially transparent if and only if it produces the same result given the same input arguments.

A function using the I/O monad meets this definition. That "RealWorld1" in your example is ensured by the language not to occur anywhere else in the program and can thus safely be discarded by the implementation is irrelevant to the definition.

Furthermore, the I/O "world" value need not be "magic" as you put it. It could perfectly well be implemented with a cache so that extra-monadical functions (i.e. those not using the monadic interface), with types such as IO x -> x, may be exposed. This isn't just make-believe; Mercury (a pure language which uses unique types instead of monads to enforce I/O linearity) has a time-traveling debugger which caches I/O results so that a program may be replayed.

That RealWorld2 "overlaps" RealWorld1 is merely an implementation detail, a shortcut the compiler takes because it knows that linearity is enforced by the monad structure. The semantics of Haskell and monads are in no way affected by this optimization (see the constructive proof in the previous paragraph), meaning that this behavior has no bearing on the referential transparency of the I/O monad.

May be I'll call it 'Pure Computation'

Thanks for all you detailed comments and feedback. I've written a new blog article in response.

In brief, it talks about a new term, I called "Pure Computation", which I hope can better describe my intention.

Here's the definition - a function may be described as pure computation if both these statements about the function hold:

  • The function always evaluates to the same result value given the same argument value(s) and the same set of context value(s). The function result value cannot depend on other hidden information.
  • Evaluation of the result does not cause any semantically observable side effect or output through the direct interface of the hosting world.

I've highlighted in bold the parts that are in contrast to Wikipedia's definition of pure function.

Adding dependency on context values is to help include contextual features in XQuery and XSLT. The condition that side effect must be observed through the direct interface of the hosting world is added, so that we cannot be fooled by the threading RealWorld. When we talk about side effect, our primary concerns are of course related to the real hosting world instead of a virtual RealWorld that the compiler makes us believe. This condition is to help exclude Haskell kind of monadic I/O implementation.

IO Monads

I think you are slightly confused about the IO monad.

Building a value of type (IO a) is exactly that: you build - in a pure functional language - a value of type IO a.

Exactly as you would build a String or a List of Int.

The use of the value by the runtime will produce side effects.
But the creation of something of type IO a has no side effect.

In a Haskell program, you only create such a value of type IO a. Consequently, the program is built with pure functions.

IO Monads are different from other monads

In theory, yes, I/O monads look the same as other monads.

But when comes to reality, there's distinctive difference between I/O monads and other monads. I/O monads have sides effects on the hosting world. Other monads do not. That's why in Haskell, I/O monads have to start from main.

To me, calling one function 'pure' but at the same time allows it to perform side effects is just fooling oneself.

I have no doubt that monad is a great theoretical tool. When I/O monad is carried out loyally (without destructive side effects), it can result in novel system. For example, a Versioning File System, in which every change to the file system does not modify any existing file, but creates new file and a new version of the system. I'll happily accept such system as pure.

When it comes to Haskell's way of doing I/O, developers are depending on a specific implementation feature rather than the actual theory.

And as I said in my blog, I/O monad is already out-of-date. It's too low level. It's not the best way to model side effects today. I think theory wise, XQuery Update is a better model. The operations in the pending update list can be reordered and merged. It gives more freedom to the implementation to implement fancy features like concurrent update.

Monads are a way of

Monads are a way of intercepting a sequence of applications of functions on a value that gets passed along the function chain. The I/O monad isn't different from other monads in this respect. That is, if you built a monad that implemented side-effects as a pending update list, you would be reinventing an implementation the I/O monad.


By now, I think monads are a manner of making sure absolutely noone has a clue what they are talking about.

The trouble with monads

Liked that

That was an interesting read. Personally, I think that if one concentrates on the type signature, by heart: (a -> M a x M a -> (a -> M b) -> M b)), and multiply that will all possible perspectives on that (programmer's point of view, cata-gory theorist's point of view, set theorist's point of view, algebra theorist's point of view, type theorist's point of view, layman's point of view, Java programmer's point of view, imperative functional programmer's point of view, ...) you'll end up with the conclusion that there can be no possible end to the mess introduced.

(Tongue-in-cheek mandatory when reading my satirical comments.) ;-)

I think many people get too

I think many people get too caught up in specific implementations of the monad pattern. It's quite abstract, yet very simple; but a lot of people don't quite burrow down to the root abstraction, especially if they start from the IO monad. So they end up thinking the IO monad is things it's not, like impure.


The fact that people found a manner to purely compose (impure) programs with side effects gives them all right to think that it is an impure construct. At least, when run, an IO monad value is side effecting/impure.

As for the rest, you don't need an IO monad to purely compose impure programs, so whatever.

(Monads are a good idiom for composing stuff, but the IO monad should be seen as a magician's trick: Haskell carefully hides a necessary construct -a run function- inexpressible in the base language from view. So you end up discussing the wrong theories, since any comparable trick -hiding a run function from sight which interprets a pure value- would do too, at least in theory.)

Monads could help you

I am getting a mixed feeling from your blog posts. The first reaction is to say that they are about two misconceptions of yours; clearly your are wrong about some of the things you say.

On the other hand, what you say is not empty; you are trying to make a finer distinction about "contextually referential computations" that, I think (I haven't yet taken the time to understand it well), has value.

I think it could help your point to actually define precisely the program transformation you are talking about. It will probably be a monad. If it is, you can compare it with other monads to see if it can be described as having "less" effects, eg. as the Reader or Writer monads have "less" effects than the full State monad. You would gain a more precise description of the structure of your restricted effects, which is currently very hand-wavy. I realize it must be feel precise to you as you are familiar with the specialized domain, but it is still hand-wavy.

Yes, I agree monad can help

Yes, I totally agree monad can help. One day in future, Candle may provide monad support. But routines with actual side-effects will still be separated from routines without side-effects. I/O monads will still be separated from other monads.

Regarding "contextually referential computations", I have no doubt it can be modeled in some form of monad. As I'm not so familiar with all the monads in Haskell, I cannot propose the solution here. As you said, it will be something that has less "effect" then State monad, as it has a more precise scoping than State monad. But that is just to satisfy the theoretical probe.


I personally think everyone is confused about transparency. In my view this is a property of concrete syntax.

Consider a compiler with a high level layer of sugar that translates C into Haskell, and a low level layer that translates Haskell into Assembler.

Clearly, C and Assembler aren't transparent, but we'd like to think Haskell is.

What this means is that monads may be functional but if you add syntactic sugar to make the use appear more conventional or imperative, the result is not. The syntax, not the semantics, are what matter.

After all it is well established functional and imperative code can be translated into each other.

Global syntax transformation is Semantics

As you have shown with your example of C-to-Haskell-to-Asm compiler, all the semantics difference between two languages can be bridged by successive rewriting passes. In fact, if you target an end language that is "purely semantics" (eg. a Turing Machine, or a minimalist lambda calculus), you can define the semantics of your language by this transformation process; this is called semantics by elaboration.

So for certain ranges of syntactic phenomenons, you cannot say that they are "syntax, not semantics", because they are syntax and semantics.

Still I don't think transparency is one of those, as you claim. An essential property of syntaxic sugar is that it is a local syntactic transform. You have a (usually small) part of your program that get rewritten in a certain way, and more importantly this rewriting is "simple", "homomorphic", "continuous" in a certain sense: the rewriting of a composed construct is a composition of the rewriting of its parts, nothing more (in which you could sneak in more semantics).

It is perfectly correct, in my opinion, to claim that some part of Haskell code using the do-notation is not "referentially transparent". That is, you choose to think at the level of the apparent surface effectful language, rather that at the level of its pure translation into "base Haskell". Monads were presented by Moggi as a way to translate from effectful to pure languages.

However, not all your program is in this form. The vast majority of the code is still not in the do notation (whether it uses the `>>=` and `return` operators or not), and this code is to be understood with a pure/transparent semantics.

If you translate C into Haskell, you can choose to embed all your program in the `do` notation to keep a similar structure, and you get an imperative program, or to translate all programs with explicit state-passing, and you get a pure/transparent program, because your global "syntactic" translation had a semantics meaning.