Explaining monads

I am off to bed, so I'll get back to this tomorrow per Luke's request.

For the time being I recommend section 2 of Wadler's Monads for functional programming.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Still don't quite get it

I read the paper, and I thought I was beginning to understand what monads are. Then I lost it again. Monads seem to be presented as a way of avoiding side-effects. This doesn't seem to be true. From what I've seen they are merely a way of delaying side-effects until some later time. For instance, reading about the IO monad in Haskell gives me the impression that instead of actually performing I/O, you create a computation which when executed will perform the I/O. This computation is then returned as the value of the main value of the program. But then what happens? "And then a miracle occurs?" Presumably, this computation is then executed in an impure fashion? So rather than allowing Haskell to do I/O, this allows Haskell to create impure programs which do I/O and hand them off to some other language interpreter. Is this right? If so, isn't it a bit of a cop-out (a tacit acknowledgement that pure languages are of limited use in an impure world)? If not, what am I missing?

I can see the utility of using monads for simulating state, as a design pattern. This seems to work as you really can avoid side-effects this way. But I/O is side-effects, so there's no getting around it. So, what is the advantage of using monads for I/O? You eliminate side-effects from the Haskell program, so it is still nice and easy to reason about. But then don't you still have the problem of reasoning about what the monad will do when it is executed (outside of your program)?

sequencing

Monads seem to be presented as a way of avoiding side-effects. This doesn't seem to be true. From what I've seen they are merely a way of delaying side-effects until some later time.

Yes, they do delay the side effects until running the computation, but they also explicitly sequence the effects. That's the common feature of all monads: explicit sequencing (just like CPS).

For instance, reading about the IO monad in Haskell gives me the impression that instead of actually performing I/O, you create a computation which when executed will perform the I/O. This computation is then returned as the value of the main value of the program. But then what happens? "And then a miracle occurs?"

The "miracle" that occurs is no different from the miracle that occurs whenever any language has I/O: it always has to be implemented by some underlying system. That's not the problem the I/O monad is solving. With laziness, it's hard to know what order things will happen in. The I/O monad sequences the effects so you know they'll happen in the right order.

As for when it gets run, every monad comes with some sort of a "run" operation, which goes from Tα -> α. This is the only way to "get out of" the monad. All the monads in Haskell have some sort of library function for running the computation, except for the I/O monad. The only way to run the I/O monad is to run your entire program (from the command line, or the REPL, or whatever). So in this case, the monad's "run" operation corresponds to the program's "run" operation.

If so, isn't it a bit of a cop-out (a tacit acknowledgement that pure languages are of limited use in an impure world)? If not, what am I missing?

I don't like all the talk of "purity." I/O isn't bad, it's just hard to reason about in conjunction with a lazy language. Shutting all I/O into the monad means the monad laws will hold of all I/O computations.

I can see the utility of using monads for simulating state, as a design pattern. This seems to work as you really can avoid side-effects this way. But I/O is side-effects, so there's no getting around it. So, what is the advantage of using monads for I/O? You eliminate side-effects from the Haskell program, so it is still nice and easy to reason about. But then don't you still have the problem of reasoning about what the monad will do when it is executed (outside of your program)?

Again, it's not about eliminating I/O, it's about forcing it to obey certain algebraic laws.

Ahhh!

Thanks! Now, I think I finally understand.

IO violates RT

I don't like all the talk of "purity." I/O isn't bad, it's just hard to reason about in conjunction with a lazy language. Shutting all I/O into the monad means the monad laws will hold of all I/O computations.

It doesn't matter if a language is lazy or strict, invisible IO (and other invisible side-effects) violates referential transparency.

IO monads

So, let me get this straight - when running a Haskel program that uses IO monads - does it do the entire calculation and then only finally 'execute' the monad outputting everything in one go? or is there some clever trickery in the compiler/interpreter that lets it perform the IO 'as it goes along' ?

Trickery

Because IO obeys the algebraic monad laws, the compiler can safely transform functions that create IO actions into code that directly performs IO. This code then "drives" the evaluation order of pure lazy functional code.

There is a good paper about how IO monads are transformed and optimised in the GHC compiler, but I don't have the reference handy.

yes, sort of

Conceptually, the description "evaluate a function to get an IO program, then execute the IO program" is correct (with the caveat that the generated IO program is often infinitely large). Due to the lazy evaluation of Haskell, both parts are actually interleaved, so the operational description is more like "evaluate enough of the function to know the next IO step, execute the IO step, then go back to evaluating".

It's not laziness but functional abstraction

See here.

Meaning the monad constructs

Meaning the monad constructs an imperative program to do the IO.

See "All about monads"

http://www.nomaware.com/monads/html/index.html

This is the explanation that finally did it for me.

Don't get too hung up on the IO monad, because IO is a rather atypical special case. Most monads have a "run" function that takes a monadic action and executes/evaluates it (e.g. runST, runState, runCont). IO doesn't because the whole program is an IO action.

Within a monadic action (which is composed of smaller actions) side effects happen. To be more precise, the monad defines a context through which the effects propogate from one sub-action to the next. For example within the "State" monad the state value is passed along as a hidden parameter by the "bind" function. This hidden parameter is accessed by "get", and replaced by "set".

The ST monad is similar, except that instead of a fixed state value you have a dynamic heap to play with. You could almost simulate it in the State monad using a resizeable array and passing out index values, except that garbage collection wouldn't happen. Once you have constructed your ST action you call runST on it. From the outside runST is a function: it preserves referential transparency, and if you call it twice with the same arguments you are guaranteed to get the same result. On the inside you have side effects propogating through the ST implementation of "bind", so you can write imperative code.

IO is like ST, except that its not just the state of a heap that gets passed through "bind"; the state of the world gets passed along as well. There is no runIO function because the world cannot be passed around like a normal parameter. Instead you run the entire program.
(Of course the infamous "unsafePerformIO" looks a lot like a "runIO" function, but it doesn't preserve referential transparency).

Paul.

I have not yet found a better tutorial...

Than the one here. This is linked from the TUNES wiki node for Monads; please add useful links and information there.

Clean world?

The paper talks about SML & Scheme as impure, and Haskell & Miranda as pure. No mention though of Clean's use of world to isolate side effects.

So where would Clean fall in this list? Are the world's in any way considered monads?

Clean has unique types

Every function that wants to change the world must get a world parameter and return the changed world. As the world type is linear the compiler will flag another usage of the old world:


-- * says the type is unique or linear
-- (i.e. can be consumed only once)
fwritec:: Char *File -> *File

WriteABC:: *File -> *File
WriteABC file = fwritec 'c' (fwritec 'b' (fwritec 'a' file))

--or
WriteABC file = file3
where
file1 = fwritec 'a' file
file2 = fwritec 'b' file1
file3 = fwritec 'c' file2

This chapter of the Clean 2.0 language report contains the relevant information.

Unique types are quite powerful and with them we can write the IO datatype in the language (instead of having it as a primitive) and it would be like the state monad.

Monads are a way to structure programs, in particular the IO monad in Haskell is a way to model IO. Unique types are another (orthogonal) way to structure programs, being used in Clean to model IO.

linearity and monads

Unique types are another (orthogonal) way to structure programs, being used in Clean to model IO.

The connection:

Imperative Functional Programming

Rolling Your Own Mutable ADT -- A Connection Between Linear Types and Monads

Concurrency?

Sometimes I think that discussion of Monads gets too intertwined with the Haskell syntax for it. Figure it doesn't hurt to look at it from another language which uses a different syntax & vocubalary. From the Clean perspective, I gather the Unique type is tied with the concept of sharing and locking of types.

Would it be fair to say that Monads are also tied to the issue of Concurrency? (Not just during runtime, but also compile time?)

Schemer's introduction

I don't mean to scoop you, Ehud, but ever since Luke's comments yesterday I've started working on a Lispy (OK, Schemey) introduction to monads. It's just a first draft, but my aim is to build the basic ideas from the ground up. It could use help, so suggestions greatly appreciated.

Luke, let me know if this is at all helpful. Don't be nice -- if it's useless, just say so. :)

Meantime, I've found All About Monads pretty approachable in the past, although Haskell comes with a bit of baggage for learning monads (esp. type classes and the strange and misleading IO monad).

Very good!

Thank you Dave!

I seem to understand Monads now after reading your article. Of course, I've said that before, so I'd better do an exercise to be sure :-). I tried to write the "collect even and odd numbers by effect" example from a recent discussion. I think my solution is close, but it gets rejected by the type checker and I can't figure out why:

import Monad
import Control.Monad.State

-- Separate a list of integers into evens and odds.
-- Return the results in an (evens,odds) tuple.
evens_and_odds :: [Integer] -> ([Integer],[Integer])
evens_and_odds list =
    state where (_, state) = runState (do mapM_ collect list
			                  return get) ([], [])

-- Collect an integer into the even or odd part of the state.
collect :: Integer -> State ([Integer],[Integer]) ()
collect n = do (e, o) <- get
	       if n `rem` 2 == 0
		  then put (n:e, o)
		  else put (e, n:o)
	       return ()

main :: IO ()
main = do putStr $ show $ evens_and_odds [1,2,3,4,5,6]

Can someone spot the bug - am I close? And how would a real Haskeller have written that (assuming an irrational desire to collect by effect)?

Anyway, the article is very good: it clearly explains (to a Schemer) what a Monad is and what it's for. The final monadic code is still fairly ugly and is doing something quite unsexy so I would have been unsatisfied if you hadn't included the link to All About Monads. That catalogue of Haskell monads and their examples seems like exactly the right next step from understanding the mechanics.

Thanks again!

IIUC (I can't run your exampl

IIUC (I can't run your example right now) the offending line is this "return get". Here "get" has type "m s" while return has type "a -> m a" so here you are trying to return the state transition "m s" instead of the state. You can do two things:

1. Substitute "return get" by

s return s

or even simply

get

which is equivalent.

2. Ignore this, because you don't need to return the state explicitly. The "runState" call has type "(s -> (a,s)" and since you're applying this to the initial state ([], []) it will return the last value (i.e. () from the mapM_ call) and the last state (i.e. (evens, odds)).

Also you don't need to write "return ()" in the end of "collect" because "put" has type "s -> m ()", that is the last expression in the do block (i.e. "if ...") has the correct type and "return ()" is just a no-op.

Great

Thanks, that gets it working:
evens_and_odds :: [Integer] -> ([Integer],[Integer])
evens_and_odds list =
    state where (_, state) = runState (do mapM_ collect list) ([], [])

collect :: Integer -> State ([Integer],[Integer]) ()
collect n = do (e, o) <- get
	       if n `rem` 2 == 0
		  then put (n:e, o)
		  else put (e, n:o)
Obviously I'm not on top of this yet, but now I've got a foothold and I'm interested.

If someone could next show what makes type systems so interesting in terms that I'll understand then I'll be well on my way into the modern functional programming world.. :-)

macros

The final monadic code is still fairly ugly...

You're absolutely right -- I forgot the punchline for the Scheme version of the story. Monadic code is ugly precisely because you're making the threading of control explicit. Worse, you have to give a name to the result of every intermediate computation (hint: this is a big link between monads and A-normal form). The designers of Haskell added the do-notation as syntactic sugar for easing some of the notational overhead in the explicit sequencing. (There's not much you can do in general about the nuisance of intermediate names, though.)

Schemers don't need to ask the language designers for syntactic sugar, though: macros are our own personal confectionary. To complete the rand example, here's a quick monadic compiler (very minimally tested):

(define-syntax <get>
  (syntax-rules ()
    [(<get>) (get-seed)]))
(define-syntax <set>
  (syntax-rules ()
    [(<set> v) (set-seed v)]))
(define-syntax <return>
  (syntax-rules ()
    [(<return> v) (lift v)]))
(define-syntax <let>
  (syntax-rules ()
    [(<let> (a m1) m2)
     (pipe m1 (lambda (a) m2))]))
(define-syntax <begin>
  (syntax-rules ()
    [(<begin> m1 m2)
     (pipe m1 (lambda (ignored) m2))]
    [(<begin> m1 m2 m3 ...)
     (pipe m1 (lambda (ignored) (<begin> m2 m3 ...)))]))

Notice that we no longer need the earlier definition for begin, because it's just a special case of pipe. Finally, we can rewrite rand:

;; -> T(number)
(define (rand)
  (<let> (seed (<get>))
    (let ([ans (modulo (* seed 16807) 2147483647)])
      (<begin> (<set> ans)
               (<return> ans)))))

This should look familiar... If you're thinking, "well that's pretty much just what we had to start with when we allowed ourselves to use set!," then you're right. The idea is to simulate effects using a pure host language. If you already have the effects, there's no need to simulate them.

No magic?

So is it the case that given a lazy functional language with primitives for unrestricted side-effects one can still write real imperative programs (e.g. webservers) by programming in monadic style? And thus Haskell's Monad support is "just" a type system to enforce this style plus some syntactic sugar?

Re: No magic?

From my naive viewpoint, yes. This is even "admitted" in papers from various Haskell gurus, with a little note about how imperative programming in Haskell is safer than in most languages :)

Here's an interesting and easy to follow paper, though not the one I was looking for: Lazy Imperative Programming

Re: No magic?

given a lazy functional language with primitives for unrestricted side-effects one can still write real imperative programs

Depends what you mean by unrestricted, and probably depends on the language/implementation. Haskell implementers warn that you can't rely on the unsafePerformIO extension, a primitive that produces side effects outside of the monad. So it's not just a matter of throwing in some side effects and then the programmer can gain control of them.

What goes on behind the scenes is that the implementation tracks the primitives which produce arbitrary effects, and expressions built from them. It ensures that the primitives are invoked in exactly the sequence determined by the program. If this mechanism were made to sound more impressive, maybe it would qualify as magic.

You can create a web server in Haskell, but due to the IO monad putting every action into a sequence, you don't get a typically nice multithreaded web server. For that kind of thing there's an extension called Concurrent Haskell.

Order of evaluation, or, some magic?

I don't understand how Monads enforce order of evaluation. Dave's article shows that each operation takes the result of the previous one as an input, which is obviously some kind of dependency chain. It would be enough to handle Scheme's "strict, but function arguments can be evaluated in any order" rules to write a working begin function. But how do we know that an (e.g. lazy) evaluator will make the real-external-world effects happen in the right order

It sounds like you're saying this aspect is handled specially by the compiler/runtime system, is that right?

Dave's thesis committee will certainly want to know this. :-)

The New World

...each operation takes the result of the previous one as an input, which is obviously some kind of dependency chain. It would be enough to handle Scheme's "strict, but function arguments can be evaluated in any order" rules to write a working begin function. But how do we know that an (e.g. lazy) evaluator will make the real-external-world effects happen in the right order

Laziness works by wrapping constructors with a delay and selectors with a force. In other words, when you actually need to extract a piece of a value, you have to make sure the value has been computed.

So one way to ensure something gets computed in a lazy program is to select a piece of its result. (Of course, if that part of the program is delayed, i.e., the part doing the selection, then it won't be computed, nor if the part computing the part computing the part... and so on. At the top-level, a lazy program has to be kicked off or nothing will ever compute.) As you say, a program in monadic style builds up a chain of dependent computations: each successive computation requires the result of the previous computation before it can be executed. Then in the top-level (i.e., the monad's run function), it extracts the result of the entire computation. Since this result isn't available without running the last computation, the entire chain must be evaluated.

In the case of the IO monad, check out Imperative Functional Programming (link above), esp. sections 2 and 4. The idea is to thread a dummy accumulator that stands for "the state of the world" through the computations. This dummy is encapsulated in the abstract IO data type, so the programmer can't touch it. Since a non-IO function can't call an IO function, the only way IO operations can be kicked off is from the top-level main function. The intuition is that the REPL hands the current state of the world to main, which threads the world through its IO computations, and ultimately returns the modified world when the program completes. To kick off the entire computation, you can imagine the top-level asks for the final state of the world in order to force the computation to take place.

This chain of computations also enforces linearity, meaning that the modified "world" resulting from an IO computation will only be accessed once, namely when it is handed off to the next computation in the chain. This allows the Haskell implementation to perform the actual real-external-world operation in-place when it executes an IO computation, since it knows that this is the only place in the program where that state of the world will occur.

Blindingly simple...to non-FPers

The idea is to thread a dummy accumulator that stands for "the state of the world" through the computations.

And that's trivially easy to understand in terms of low-level programming. Once you start talking about the three laws, etc., then you've lost most people. It's like saying "now the first thing you need to understand about iteration is what a loop invariant is," when what programmers want to hear is "this is how you write a while loop."

Dave's thesis committee will

Dave's thesis committee will certainly want to know this. :-)

Are any of them still alive? I heard there was some kind of "accident" with a begin...

Monads are like CPS in this sense

If we convert a piece of code to CPS we can enforce the order of evaluation (specially useful in a lazy language like Haskell). Monads have a similar behavior. The type of bind is "(>>=) :: (Monad m) => m a -> (a -> m b) -> m b" which says: take a computation and a continuation and return the computation resulting from applying the continuation to the result of the given computation. That forces the left value of bind to be evaluated before the right value.

Also the monadic do notation translates to a series of binds and they look quite like continuations:



do x = (\x -> my >>= (\y -> return (f x y))

Monads and Scheme

Interesting introduction.

The paper Abstraction and Performance from Explicit Monadic Reflection" by Jonathon Sobel, Erik Hilsdale, Kent Dybvig, and Dan Friedman is quite interesting. It shows how to transfer the monadic paradigm to Scheme in a very way.

Other texts on monads and Scheme can be found at IdiomMonadicProgramming at the SchemeCookBook.

I guess I am redundant

Seems likes yet another explanation of mands isn't really needed, so I won't bopther you with one...

Not so

I took your advice and started reading Monads for functional programming. I'm taking a break now after section two but it's a marvellous paper, so thank you for the recommendation. I thought that I'd attempted it before but I must have tried one of his less introductory papers by mistake.

Of course I'm duly pleased that he expresses everything in plain ol' code.

Hm

Refering to typed lambda calculus as "plain ol' code" may be a sign that I need to re-examine some of my prejudices about formalisms. :-)

Sorry

I didn't mean to cut you off. (I actually think the more explanations the better, because I never understand anything the first time.)

No prob

Perfectly alright. Since I was quite busy all week it was great that others provided the explanation I promised to get back to.

Not just sequencing

There is actually no requirement that monads sequence things in any way, it is just one of the most common ways monads are used.

For examples of monads which do not 'sequence' their arguments, see (in the ghc haskell libraries)

Control.Monad.Reader which distributes a value to all of its sub-computations but otherwise evaluates them with normal lazy semantics.

or
Control.Monad.Writer which does something similar, but collects an extra value from its subcomputation and lazily combines them.

or for the grand finaly.

Control.Monad.Identity which is the identity monad. Monadic code run in this monad behaved IDENTICALLY to if you had written it without any monads at all. (monad errors are translated to bottom). This is a very useful monad indeed.

I am unsure whether learning about monads as general constructs, one thing (of many) which they will let you do is sequencing will help or hurt the new-to-haskell programmer.

Re: yes, sequencing

(I just noticed the above post. For posterity's sake, I don't want to let it be the final word.)

For examples of monads which do not 'sequence' their arguments, see

Monads do not have "arguments." However, computations in a monad are always sequenced. This is exactly what the >>= and >> combinators are for: sequencing computations.

Control.Monad.Reader which distributes a value to all of its sub-computations but otherwise evaluates them with normal lazy semantics. or Control.Monad.Writer which does something similar, but collects an extra value from its subcomputation and lazily combines them.

These examples absolutely involve sequencing. For example, the following computation:

doStuff :: Writer [String] ()
doStuff = do tell ["message 1"]
             tell ["message 2"]
             tell ["message 3"]

produces the following interaction:

*Main> snd $ runWriter $ doStuff
["message 1","message 2","message 3"]

The result list is in that order because -- as always -- the sub-computations in the monad are performed in sequence.

I am unsure whether learning about monads as general constructs, one thing (of many) which they will let you do is sequencing will help or hurt the new-to-haskell programmer.

I'm having trouble parsing that sentence, but I think you're questioning whether it's helpful for new Haskell programmers to think of monads in terms of sequencing. I don't know how else you'd like people to think of it. Do you want them to think about domains and Kleisli triples?

Simon Peyton Jones likes to describe Haskell as his favorite imperative programming language, and this is the theme of Imperative Functional Programming, that monads are a way of embedding imperative code, i.e. sequences of instructions, in the middle of (lazy) functional code.

fustigate (v.) 1. to beat with a stick; 2. to criticize severely

Roughly, a monad is commutative if, when both sides make sense:
do { x <- m; y <- n; p }  =  do { y <- n; x <- m; p }
By this criterion, the reader monad is commutative: it doesn't matter what order you read things in. The usual writer monad, however, is not commutative, and neither is the state transformer monad for most state types. A very important monad that is commutative is the substitution monad for an algebraic theory.
However, computations in a monad are always sequenced.

They are not sequenced in a commutative monad. In particular, a compiler which could prove that a monad is commutative would be free to reorder computations, because it is not an observable property.

Simon Peyton Jones likes to describe Haskell as his favorite imperative programming language, and this is the theme of Imperative Functional Programming, that monads are a way of embedding imperative code, i.e. sequences of instructions, in the middle of (lazy) functional code.

If you look at Wearing the hair shirt: a retrospective on Haskell, also by Simon PJ, you will see that one of the things he would like for Haskell is a way to distinguish commutative monads from non-commutative ones, precisely because the current scheme overly sequentializes things.

(BTW, another thing he says is that "our biggest mistake [in designing Haskell was u]sing the scary term "monad" rather than "warm fuzzy thing". :)

linearity, then

They are not sequenced in a commutative monad.

I stand corrected. Let's refine my explanation, then. Sequentiality is still critical for all non-commutative monads, because basically you're hosed if you can't guarantee that your effects will occur in the proper order, no matter what (non-commutative) effects you're talking about.

For all kinds of effects, both commutative and non-commutative, it seems linearity is the common element: even for the "fresh value" monad (e.g., gensym), where it doesn't matter if expression A gets g257 and expression B gets g258 or vice versa, just so long as each gensym occurs exactly once.

(BTW, another thing he says is that "our biggest mistake [in designing Haskell was u]sing the scary term "monad" rather than "warm fuzzy thing". :)

My friend wants to start a line of Haskell-inspired sleepwear known as Simon PJ's. I can only assume they'll be warm and fuzzy.

Damn you beat me to it...

Hmm, I saw the error too. However, from my layman's perspective - I wondered about the following. Is it the commutative property or the operational semantics of a monad (the IO monad in particular) which enforces the sequential operation of the composition?

I couldn't help but notice that in the Jones/Wadler paper they forgot to define a "run" function (yeah, I know it is impossible to define explicitly) but they just assumed the operational semantics to be - first do program A then give its output o to B and then perform program B.

Semantically, there is nothing, however, which enforces that program A is run to completion before program B. Worse, given the lazy semantics of Haskell, I think I can define a program A which returns a (lazy/partial) list of values as output o which can be interpreted by B which may force the rest of the evaluation of A. I.e. there is nothing in the monad which enforces the sequential evaluation of the programs A and B; rather, it seems to be a property of stringing together strict programs...

Right?

Note: I removed the most obvious typos.

"The source of all type errors are type systems."

Is it the commutative property or the operational semantics of a monad (the IO monad in particular) which enforces the sequential operation of the composition?

The former. For example, the IO monad in GHC also includes an operation forkIO.

Where are you going with that fork!

First, no idea what you meant with that forkIO remark. Please elaborate if you still feel like. (I don't)

Uhm, second, I should have phrased my remarks better.
It should have been stated something like:

Even if the IO monad satisfies all monadic laws and the monad is linear (non-commutative) is the correct operation of the IO monad not a feature of the operational workings of some implementation on an abstract signature instead of a feature of the algebraic laws that that signature satisfies?

Or something like that...

Whatever...

Maybe we should just conclude that IO monads are there to do IO stuff in a type-safe manner and leave it there? This is getting boring and I don't have the feeling anyone is interested. (I am not)

Naive answer

You have to evaluate

x = 2 * 3

and

y = 7 + 4

before you can evaluate

z = x * y

and I would be quite surprised to learn that this was a feature of "the operational workings of some implementation" that evaluated those expressions. My understanding of the matter is that both CPS and (non-commutative) monads exploit dependencies of this kind in order to impose/express sequencing of evaluation. Also that monads, at least, do so via their type signatures, which define the dependencies that must be satisfied for the evaluation of each expression.

I am not that sure too, however...

Program A does IO and returns a lazy list; program B takes it as input; while B evaluates that list it runs A to completion which does more IO. Thus the evaluation of A and B are intertwinned/not-sequential (to the outside world).

Most monads, over a hash-map for instance, when run, will be lazily evaluated (for example, inserting a key/value pair is "run-to-completion" on a map only after you ask for that certain key) in a lazy language like Haskell. It just doesn't matter, a programmer doesn't really notice, since it is a referentially transparent program.

However, as a concrete instance of the contrived example: what happens if I had a program which returns all names of files in a directory as program A, and a program B which erases all files in that directory with the inverse name. If the directory contains files "ab" and "ba" is it going to be empty afterwards or not? If A is strict, yes, if A is not, the directory might contain the file "ab".

Thus, sequential composition need not enforce sequential evaluation in general.

This is just a technical point with regard to the sequencing remarks. *yawn* boring! duh, grrr.

The naivety continues

I think the two possible behaviours for Program A and Program B actually look different in code.

Example one: the first monad runs to completion, constructing a complete list in the process that is then consumed by the second:

main :: IO ()
main = do fileDescriptors <- getFileDescriptors;
          deleteInverses fileDescriptors

getFileDescriptors :: IO [FileDescriptor]
deleteInverses :: [FileDescriptor] -> IO ()

Example two: a lazy list of monadic values is constructed and (presumably lazily) consumed by a fold:

main :: IO ()
main = deleteInverses getFileDescriptors

getFileDescriptors :: [IO FileDescriptor]
deleteInverses :: [IO FileDescriptor] -> IO ()

In the first example, deleteInverses must run the monad getFileDescriptors in order to get a complete list of FileDescriptors; in the second, it must run each of a lazy list of monads to get each of the FileDescriptors it will use. There is no way it can consume a FileDescriptor without running some monad, and hence no way it can operate on a lazy list of FileDescriptors: the type of an IO action consed to an IO action is [IO a], not IO [a].

Cogito ergo sum.

no idea what you meant with that forkIO remark.

You asked (or I understood you as asking) if there was anything intrinsic about monads which ensure that side-effects of operations are sequenced. My answer was, no, there is nothing, it depends on the monad. As a counterexample to the idea that side-effects are sequenced in every monad, I gave the IO monad, which supports an operation forkIO. Since forkIO creates a new process running in parallel with the current execution thread, one which may have side-effects, it is clear that IO is an example of a monad that does not sequence side-effects.

Je pense donc j'ai mal à la tête.

Having begun a whole discussion by saying "monads are all about sequencing!" and then having lots of people say, "no they're not!" I hope I'm not being stubborn, but it still seems to me that sequencing is very relevant. When you say

IO is an example of a monad that does not sequence side-effects

I don't quite buy it. Sure, not all IO effects are sequenced, since you can branch from the main "trunk" and the effects from multiple branches can be interleaved. But within each branch, the effects are in fact still sequenced. And the fact is that the world-passing style enforced by the IO monad still guarantees that sequencing. I mean, that's the whole point.

At least when you're using the unit/bind formulation, computations are chained together in an implicit sequence, since the result of one sub-computation is piped to the next sub-computation. If you have other operations, like forkIO, that allow you to create new orderings of operations, then the monad may not guarantee a unique ordering for all sub-computations, but there still are some sub-computations that have to happen before others.

Also, when people say monads can be used as a framework for talking about all kinds of effects, that sounds like voodoo. Explaining that the common element is talking about the ordering of the effects helps demystify it.

I suppose a fork's out of the question

The kind of sequencing introduced by a monad specifies that one thing must happen (be evaluated) before another can happen (be evaluated), but it doesn't rule out the interleaving of other things happening (being evaluated) at the same time, and not necessarily in any strict order either.

I kind of like the idea of "a framework for talking about all kinds of effects" - I find I want to understand it better.

It is hopeless...

really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Yes, monads are quite elegant from a pure theory POV, but if there ever was a feature that will prevent pure functional programming from going mainstream, this is it.

It doesn't help that even if you "get" monads, that in everyday usage even making small mistakes will spawn the most horific type errors from most implementations, at least last time I checked.

Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Paradigm Shift

Wouter: really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Poppycock. If these "very seasoned programmers" are only seasoned in C/C++, then there's a lot of stuff in Haskell that will blow their minds for a while. So what? That's the nature of a paradigm shift. In my experience, both with myself and many colleagues over the years, there's a certain amount of weeping, wailing, and gnashing of teeth anytime a new paradigm is used on a project, even if the language doesn't change—for example, I introduced boost::bind and boost::tuples on a C++ project recently, and still get questions about them. Good thing I didn't use boost::lambda, boost::signals, or boost::spirit...

This attitude is the thing I hate most about our industry: the notion that, hey, I already know enough; quit asking me to learn something new!

Wouter: Yes, monads are quite elegant from a pure theory POV, but if there ever was a feature that will prevent pure functional programming from going mainstream, this is it.

Monads aren't just elegent from a pure theory POV, although they certainly are that. They're also absolutely necessary if you wish to extend your use of specific functional strategies across certain boundaries. What I do agree could make a difference to the user of a language is whether, as a programmer, you must identify where one or more monads are necessary, arrange for their construction, apply any necessary monad transformers (which don't compose in the general case, which is my principle beef with monads), and so on; or whether the language might automagically ensure that code that has side-effects or does I/O, or both, is handled monadically, including the application of transformers that are known to compose in the necessary cases. In other words, how far can we take type inference in the service of automating some of the awkward squad?

Wouter: It doesn't help that even if you "get" monads, that in everyday usage even making small mistakes will spawn the most horific type errors from most implementations, at least last time I checked.

This is the usual complaint about an implementation rather than a concept. Helium is deliberately designed to do a much better job at this than most Haskell implementations, although it does so partially by sacrificing features. No doubt it and other implementations will improve over time.

I would also observe that using templates in C++ is absolutely no better. Perhaps the point is that we need something better than spitting out long strings of text for presenting errors in the context of a rich type system.

Wouter: Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Give up. Go learn enough Category Theory to be able to cope with monads. Yeah I know it is hard, sorry.

Difficulty of understanding monads overrated

Much as I struggle with CT, having neglected to struggle at sufficient length with the things CT is about, I don't think that understanding the way monads work in Haskell is particularly hard. Hutton and Meijer's tutorial on monadic parser combinators was a good way in for me, as it built up the monad combinators from scratch and showed how they were used to combine parsing operations. After getting through that, I felt I had my feet on solid ground.

You can use the monad syntax in Haskell without even understanding (or having once understood, then forgotten) that much. I admit that I still think of monads as being really "about" sequencing and threaded state, but as a way of thinking about the non-commutative monads that's good enough to be getting on with.

I too think that automagical monadification of side-effecting code would make life easier for the poor toiling programmer, and wonder whether anything of the sort has been tried. I would also note that monad transformers are not the last word in composing monads, as this paper for instance suggests.

Stones

I don't see what category theory has anything to do with this subject except that it confuses the subject even more. I don't think Wouter ever needed it when he defined any of his languages - and, if anyone, I guess he is the expert on languages.

Also, you don't need category theory to understand the monads of Haskell!!! They have perfectly clear operational or axiomatic or (who cares) van Wijngaarden semantics.

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

It is easy, there is no need for category theory, and everybody understands it. (Yeah, yeah, the category theorist will now stand up and categorically claim: but a procedure then has an IO monadic type! But, then assembly clearly is lambda calculus too, ain't it...)

PS: I am not that against category theory, I just wonder whether it is the right abstraction used in this case.

Ha!

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

Oddly enough, you can formalize this approach using modal logic.

There is no escape from crazy theory. :)

Nice idea

Actually, I like the idea. Since modal logic is logic w.r.t. worlds, and functional IO changes worlds, it seems pretty natural to express it in a modal logic.

Well, uhm, thanks ;-)

The logic of monads is

The logic of monads is modal, so perhaps this isn't so odd...see for example this paper by Kobayashi (couldn't find a freely accessible version), and this proof-theoretic treatment by Pfenning and Davies. For those (like me) who find this level of theory a bit too crazy, it's worth checking out several of the papers by Aleksandar Nanevski.

That's how I think

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

Actually, that is how I usually think of the IO Monad and the rest of Haskell. [non IO functions] may use other [non IO functions], and [IO functions] may use other [non IO functions] or [IO functions].

if only

A paradigm shift? that is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say lazyness. Lazyness is a clean, very transparent and simple construct, that will have a smooth learning curve, even for a C++ programmer. Once you understand lazyness it becomes so solid you don't need extra thought to use it. Not so monads, they are not transparent at all, not simple, and have a learning curve like mount everest.

Really, just because it is really different doesn't make it a new paradigm. Yes people are scared of learning new languages/features, and I wish it weren't so. But with monads it isn't their fault. If you think people should just persist a little longer it is time to step out of your small circle.

You Say "Pair-uh-dime;" I say "Pahra-dig-um..."

Wouter: A paradigm shift? that is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say lazyness. Lazyness is a clean, very transparent and simple construct, that will have a smooth learning curve, even for a C++ programmer. Once you understand lazyness it becomes so solid you don't need extra thought to use it. Not so monads, they are not transparent at all, not simple, and have a learning curve like mount everest.

Since we're now relying on anecdotal support for our arguments, I see your point and raise you with concrete recent experience using tools like boost::bind and boost::lambda to achieve laziness in side-effecting references. Laziness does not have a natural implementation in C++, not surprisingly, because C++ is eagerly evaluated. So you have to be careful to put boost::ref() around references that you wish to remain references in bind or lambda expressions. Basically, it's all about using nullary functors instead of "the objects themselves," and arranging for operator() to be called when you need the functor to be evaluated. And there's never a time in C++ that I get to ignore that, unless I'm writing purely functional code where order of evaluation literally doesn't matter. I'd be very interested in seeing such C++ code: I never have, and I never expect to.

Wouter: Really, just because it is really different doesn't make it a new paradigm.

Except when it does, of course. It's generally accepted in computer science that functional programming is a different paradigm than imperative or object-oriented programming. Texts such as SICP and CTM certainly treat them as distinct paradigms or concepts. Personally speaking, I find the way that I have to think about programming to be sufficiently different among them as to rise to the level of "different paradigms." But as the title jokingly points out, unless we're willing to get into formal definitions of our terms, it's just a matter of taste.

Wouter: Yes people are scared of learning new languages/features, and I wish it weren't so. But with monads it isn't their fault. If you think people should just persist a little longer it is time to step out of your small circle.

You're a bit quick with the ad-hominems for my taste; you presume to know the size of my circle of something, probably programmers that I know, something that I assure you you're ignorant of. Nor are you in an omniscient position with respect to who is able and who is not able to understand monads. I would propose that the existence of this thread, and the increasing number of explorations of monads in languages in which they are not a first-class concept, is suggestive both of their utility, even outside the context of a "purely functional language," and of their understandability, perhaps also as a function of their divorce from their specific implementation in Haskell. Personally, I wasn't able to get a handle on them until I saw implementations in Scheme and O'Caml in rapid succession. Perhaps the comparison between a latently-typed implementation and an implementation in terms of ML modules and types clarified things for me. In any case, I had a powerful learning experience, and I'm loathe to believe that such an experience would somehow be denied to other reasonably intelligent programmers by anything that's inherent to monads that isn't also inherent to higher-order functions.

Of course, that begs the question as to how understandable higher-order functions are.

sorry

for the ad-hominems, I will try to be more ad-verbum :) My impression of your "circle" comes from what you say, and saying things along the lines of "just go learn category theory" doesn't fit into the same real world picture that I have.

I have no omniscient knowledge about what programmers are like. But I teach classes to graduate compsci students almost daily... seeing what to them is difficult and what is not is really interesting. Even though these guys are probably brighter than the "average programmer", asking them to learn category theory just to allow them to be able to cope with underlying complexity of something that is the simplest thing to them right now, sequencing stateful operations, will be very surreal to them. And if these guys won't get it, then lets not even start about the armies of VB/Cobol/Perl/JavaScript/... programmers out there.

And no this would not be different if they had all started out on Scheme or whatever... fundamentally you have a language feature with simple functionality but complex underlying semantic details (== bad).

More Verbiage

Wouter: sorry for the ad-hominems, I will try to be more ad-verbum :) My impression of your "circle" comes from what you say, and saying things along the lines of "just go learn category theory" doesn't fit into the same real world picture that I have.

OK, that's quite fair. Let's recall that my "learn enough category theory" was just a structural edit of your "Give up..." The point wasn't so much that it's necessary to have a mastery of category theory to grasp monads as that there's a fork in the road: you can give up and say "monads are hard," or you can dig in and say "why do monads seem hard?" My personal experience is that they were hard as long as I only examined them in the context of my (limited) exposure to Haskell. The Scheme implementation and O'Caml implementation that I've seen both seem quite understandable given a multi-decade grounding in higher-order functions, which is why I wrote earlier that I don't see anything intrinsic to monads that isn't intrinsic to higher-order functions, and that that only begs the question as to how understandable higher-order functions are.

But that question is a serious question, especially given recent history on my work project in which I find myself explaining and re-explaining boost::bind and boost::lambda to colleagues. It's also why I do maintain that object-oriented, imperative, and functional programming are, in fact, different paradigms, and C++ is a multi-paradigm language, with the functional aspects still somewhat mysterious and underutilized despite the standardization of the STL and the availability of high-quality FP libraries such as FC++, boost::lambda, and boost::spirit::phoenix. They make my life easier, but at least at the moment, they appear to make some of my colleagues' lives harder.

Not really different.

Personally speaking, I find the way that I have to think about programming to be sufficiently different among them as to rise to the level of "different paradigms." But as the title jokingly points out, unless we're willing to get into formal definitions of our terms, it's just a matter of taste.

I don't think that imperative, functional and object-oriented programming are sufficiently different to allow for such categorization. They certainly don't make me think differently. I use higher order programming in imperative languages every day, and I organize my code around modules I call 'objects'...

Good metaphors

Lazyness is a clean, very transparent and simple construct...Not so monads

I have to say I'm surprised at this. On a superficial level, monads are easy to use by invoking the crude metaphor of a sequencer, but the full theory behind them is more complex.

On the other hand, laziness has fairly simple theoretic properties, but drastically alters how you reason about programs, especially in terms of performance.

I suspect that someone used to eager evaluation has a much harder time actually getting useful work out of laziness than out of a monad.

The only advantage that I think laziness has over monads is that they are identified using a helpful metaphor. Laziness is a nice anthropomorphic term (pace Dijkstra ;-)) that gives us a viceral idea of what it does.

Though "warm fuzzy thing" is certainly appealing ;-), perhaps calling non-commutative monads "sequencers" and commutative monads "summers" or "joiners" might have made things easy for people who mainly want to program without confusing or cheating those who wanted to go beyond into the theoretical nitty-gritty that unified them.

As an analogy, PL semanticists think about fix-point operators, but programmers don't have to know about these: they just use recursion and have to understand convergence to a base case.

Wouter, if you think I still haven't covered some property of monads that makes them harder to learn or work with, I'm interested to know. I don't think we can make progress without figuring out what causes a feature to be more difficult.

that would be its non-transparant nature

If I am a C++ programmers, and someone shows me the Haskell "do" syntax, I think, hey, that is easy, it is just like a statement list. Until I actually try and use it like that. I will create a Monad, maybe a bit weird, but ok, some examples, and I manage to create something. Now I write my do "statements". Something doesn't match up, and I get the weirdest type errors. For a C++ programmers, these are complete chinese to me. To have an instant understanding of these type errors, I need to be so deep into functional languages to the extend that only people on this board are.

Yes I know, implementation vs design. But we've had Monads for a very long time now, and still nothing has changed. To me that says something that the problem is more fundamental. Monads could be made a language feature, but frankly that would make a lot of things more ugly in favor of some small improvements in usability.

What Monads appear to represent (simple sequencing), and how they actually work internally, are quite far apart. It is not wysiwyg at all. Good language features have a clear 1:1 relationship between syntax and semantics.

But we've had Monads for a ve

But we've had Monads for a very long time now, and still nothing has changed.
Is 10-15 years really a very long time? Given that there's only a few languages that use monads for sequencing, and that these languages are used by a minority of programmers, I'd say your remark is unfair.
[Monads are] not wysiwyg at all. Good language features have a clear 1:1 relationship between syntax and semantics.
I don't think there's any way of supporting this claim without resorting to "human factors" (no, don't go there). For example: how can you match first-class continuations, which introduces non-locality to the flow of control, to any syntax that is tree-based? Even the simpler lexical closures serve as an example. The scope is captured by a closure and used later, non-locally: Syntax alone does not describe completely what closures mean. Try to justify syntax for lazy languages. Try to justify syntax for concurrent programs.

Syntax is fundamentally different from semantics. (And, syntax issues are mostly irrelevant) This "clear 1:1 relationship" is not clear at all beyond the simplest examples, and I think even these are based on past experience and not any objective measure.

You must walk before you can run...

If I am a C++ programmers, and someone shows me the Haskell "do" syntax, I think, hey, that is easy, it is just like a statement list.

The one catch with this example is that you have chosen a programmer only familiar with a non-FP language, without requiring them to understand FP basics.

Monads are mainly motivated by an FP style, so if one hasn't grokked how a pure FP environment works and why you might want to use such a thing, it seems pretty unlikely you are going to be using a language with monads anyway.

So this wouldn't be a problem with monads, but a problem with a more basic idea upon which monads depend.

What if we didn't know it was a monad?

Wouter wrote:

A paradigm shift? that is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say lazyness. Lazyness is a clean, very transparent and simple construct, that will have a smooth learning curve, even for a C++ programmer. Once you understand lazyness it becomes so solid you don't need extra thought to use it. Not so monads, they are not transparent at all, not simple, and have a learning curve like mount everest.
At first I thought this was a joke, since what I usually hear from "C++ programmers" is that laziness is hard and unpredictable, and the sequencing is transparent and simple.

One of the most ubiquitous monads of all time serves as an existence proof that C++ programmers can understand monads. Consider the Unix command line. Most commands have the ability to take a filename as an argument and produce their output to stdout; restrict your attention to them. There is a command cat that produces the contents of a file on stdout, and an operator | that pipes the output from one command to another. | and cat obey the "monad laws" (Unix programmers know this and use it to optimize complex command pipelines):

  cat filename | cmd  =  cmd filename
           cmd | cat  =  cmd
(cmd0 | cmd1) | cmd2  =  cmd0 | (cmd1 /dev/stdin | cmd2)

puh-leeze

really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Everyone says this about everything new, but high school students do calculus and 2-year-olds operate computers. New ideas start out hard until people figure out how to make them understandable. To quote Conor McBride, "the task of the academic is not to scale great intellectual mountains, but to flatten them."

Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Give it time. People have only just started trying to figure out how to explain this stuff. Phil Wadler solved the problem spectacularly for the research community, and now people are attempting to solve it for the wider community.

Remember way back when?

Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Well, I agree.

What I mean is, the way monads are treated in Haskell is not the best way to deal with state and effects. However, monads themselves are a very important concept, and should play a bigger role in future languages.

We already know that monads are not the best solution for handling effects. Let me quote from Park and Harper's A Logical View of Effects:

Our view is that monads are an abstraction that is more general than effects and that monads are a particular way to model effects. The first is based upon the fact that many datatypes that do not involve effects organize themselves into monads. For instance, it is easy to create a list monad by instantiating the type class Monad of Haskell. The second is based upon the observation that monads are not the only way to model effects. For instance, Nanevski [28] shows how to usefully model exceptions with comonads. Thus we are led to conclude that monads are not identified with effects and the study of effects is more fundamental than the study of monads. Plotkin and Power [36] present a similar view: "computational effects determine monads but are not identified with monads."

The second citation points to one of a series of papers by Plotkin and Power (and Hyland) which study effects algebraically; you can find the rest on Plotkin's page. (Disclaimer: I haven't read them yet.)

So every effect determines a monad, but not every monad determines an effect. Another (worse) way to say this is that monads are more general than effects. So if you approach monads from the perspective of effects, and try to understand what the monad laws have to do with effects, you are already going down the wrong path. This is one reason they may be hard to understand.

There are other reasons too. Let me try to suggest them.

First, monads in Haskell are a convention, not a language feature. There is syntax for them, and a class in the standard library, but the language does not guarantee that an instance of class Monad actually is a monad (i.e., satisfies the monad laws). In this respect they differ from datatypes; if you declare a datatype T, then you know that T actually satisfies the laws for an initial algebra. (Well, most of the time. :)

In my experience, programmers have more difficulty with program concepts that are only language properties and not language features. Because Haskell provides syntax, primitives and standard definitions which are intended to be used with monads, it gives the illusion that they are a language feature. But actually, programming with monads in Haskell is a bit like programming with closures in C: you can do it, by following a certain style, but C is not really the best language for it. (Don't read too much into this analogy, though.)

Second, I think there is too much emphasis in the literature on the notion of monads themselves versus specific monads. A monad is a very general thing, and not of much use in the abstract. Let me give you an example which will illustrate this. In the typed lambda-calculus, all of the major types and operations ((a ->), products, sums, datatypes) are given by adjoint functors, and every adjoint determines a monad. Yet they are conceptually very different, obviously. Don't get me wrong: the fact that they are adjoints is important, but not as important as the particulars of each adjoint and, hence, each monad.

Third, if you look at the categorical literature, you will see that the most important use of monads is for defining algebraic theories. In contrast, the most important use of monads in Haskell is for structuring effects. In fact, I think it was mostly unknown until Moggi's paper that monads had anything to do with computational effects. (A possible exception is the partiality monad.) Consequently, it is hard to apply the categorical literature to writing monadic programs.

I think this will change in the future, though I don't yet know what form it will take. Instead of using monads for effects, we will use them to define algebraic theories. In programming terms, this means "domain-specific languages". (Monads are used for this purpose now too, but mainly to deal with effectful languages; in the categorical literature, monads are used to encode the laws of a language: the reduction relation on terms.) In particular, monads have something to do with eval/quote, and you can see this already in MetaML. (They also have something to do with my isomorphisms stuff.)

So, having said that monads are not the best way to handle effects, I nevertheless believe that they will play an important role in the future, for other reasons. The syntax will be different. Probably it will be easier to understand, because the role they will play will be something which is more connected to essential features of all monads rather than the particular class that have to do with effects.

The last thing I want to say is that I think this statement of Wouter's is a bit off the mark:

if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

You do not have to understand a feature to use it. The truth is we use features we don't understand all the time. For example, how many Haskell programmers can actually state the equational laws that Haskell programs satisfy? Probably not many. How many Java programmers actually understand classes and interfaces? Hardly any. They understand the general idea, but could not give you a formal sketch of it. What about state or laziness? Many people understand them in outline, but do they actually know what rules are sound for reasoning about them?

State, especially, is one of those things which we all have a very shallow understanding of; you can explain it using a store, and fetch/put operations, etc. But there is a big difference between that explanation and "the Big Picture": it's possible to grasp a definition without understanding all its consequences. In fact, it's so common that we developed the notion of theorems and proof to deal with it. That few people "understand" state is evidenced by the fact that it still frequently bites us in the ass.

With monads, perhaps, it is the other way around: they're easy to use, but hard to define. Yes, I don't actually believe that it is so hard to use monads. Once you get the difference between a computation and a value, and understand that "there's no way to turn IO a into a", it's pretty easy, at least if you restrict yourself to the do-syntax.

Sure, people coming from C++ will have trouble if you just throw them into Haskell and say, "OK, now if you wanna do stuff like in C++, you should use monads." Of course they'll be confused! The differences between C++ and Haskell are much more fundamental, and you cannot expect to sweep them under the rug like that.

(Wouter also complains about type errors in using monads, but this is partly because of the way monads in Haskell are tied to type classes, and anyway dealing with type errors is something a Haskell programmer learns to do in many other contexts as well.)

What I mean about it being the "other way around" is this: compare it with state. It's easy to define state in terms of a store, but it's deceptively easy: you don't see the problems until they bite you because it all goes on behind the scenes. With stateful monads in Haskell, though, the problems tend to stare you in the face, and you are forced to deal with them, explicitly. I am not saying here that one approach is better than the other; what I'm saying is that people who are not used to having all the consequences of using state thrown in their face at once might find it daunting. I mean, when people talk about "dealing" with state what they usually mean is debugging. :) (Don't talk about objects; objects leak state like a sieve.)

Here is a (not very good) analogy. The difference between explicit monadic state and implicit imperative state is like the difference between breadth-first and depth-first search. With BFS, you see all the options at once, but there might be so many that it is hard to cope; with DFS, you are blind and have to do a lot of backtracking, but it's less daunting because you see so little at once. (I know it's hopeless to say this, but: you should not read too much into this analogy.)

Better example

For example, how many Haskell programmers can actually state the equational laws that Haskell programs satisfy?

This was not a good example. A better example is things like POSIX primitives. Most people don't really understand buffering, or sockets, or whatever. They experiment with them, and eventually get a sense of how they work. Occasionally they make errors, and (if they figure out a workaround) their understanding of the system deepens. But ask them, "What is file I/O?" and I don't think they could give you anything but the most general answer. Similarly, answering "What is a monad?" is probably not going to be useful information for most programmers.

Of course, I think that it should be useful information, and programmers should know how to exploit it. And I think the lack of a formal specification for POSIX sucks.

My point is rather that you are holding monads to a much higher standard than you would hold POSIX. You want an answer to a mathematical question, but you would not demand a mathematical specification of POSIX I/O. (I would, but that's a different story. :)

We already know that monads

We already know that monads are not the best solution for handling effects. Let me quote from Park and Harper's A Logical View of Effects:

Thanks for the link! I can't wait to read this.

First, monads in Haskell are a convention, not a language feature. There is syntax for them, and a class in the standard library, but the language does not guarantee that an instance of class Monad actually is a monad (i.e., satisfies the monad laws).

I agree. I think another stumbling block is that it's hard to give a concrete answer in the context of Haskell of what the monad actually is. First of all, it's hard to think of a type constructor and two operations as a thing, and it sure isn't intuitive for a newcomer. Second of all, since the monad is really a semantic object, it's actually pretty subtle to distinguish the mathematical entity from the type class, the instance, and the implementation code. I agree that people can use technology without a deep understanding of it (who was it who said that famous quote about formal methods?), but I kind of think when Simon makes his joke about "warm fuzzy things" that it's not entirely a joke: maybe they should have called it a "design pattern" or something, because when someone is trying to grok Haskell and wants to know "what is a monad?" it's hard to give them a one-sentence summary that says, "a monad is X."

Instead of using monads for effects, we will use them to define algebraic theories. In programming terms, this means "domain-specific languages".

I don't know enough of the theory yet to really know what you mean, but I think I like this idea. My reaction to e.g. the State monad in Haskell has been to wonder why people are so excited about writing an interpreter for a useful language feature every time they want to use it. If the feature belongs in the language, it should just be in the language, right? Go ahead and use monads in the semantics, but why keep reinventing the wheel? But maybe DSL's are the meeting ground: leave the feature out of the general purpose language to prevent people from overusing it, and use monads to build up a DSL with nice algebraic properties when you need the feature often enough for a particular problem domain. Still leaves a lot to taste, but there might be a story to tell there.

This is exactly what I do

I kind of think when Simon makes his joke about "warm fuzzy things" that it's not entirely a joke: maybe they should have called it a "design pattern" or something[...]

That's exactly what I do these days. A design pattern really is just a moderately tricky algebraic structure, just like a monad. It's just that programmers think that they can understand a design pattern, whereas for some inexplicable reason they tend to get confused if you say "a monad is a composition of adjoint functors". ;)

What are the superior alternatives to monads?

The papers on Lawvere theories and modal logic are fascinating, but I don't understand them well enough to compare them to monads. Would it be possible to outline how these methods are different and why some are better than others? How might these differences manifest themselvs in actual programming languages? Is it possible to use these techniques in current languages, e.g. Haskell?

Where do John Hughes's arrows fit into this?

Monads in Perl

FWIW, I wrote a little tutorial that attempts to explain monads by implementing them in perl. Things became much clearer to me once I had divorced the idea of monads from their concrete representation in Haskell. Heck, once I figured monads out, I even dabbled with monadic parser combinators in perl.

yamt

or, yet another monad tutorial... :)

just about the IO monad in Haskell, but in a concise writing style refreshingly distant from the likes of "A Gentle Introduction to Haskell"...

i found it to be pretty instructive:

http://haskell.org/haskellwiki/IO_inside

in particular, this little section:

http://haskell.org/haskellwiki/IO_inside#What_is_a_monad.3F

Do Haskell compilers treat the IO monad specially?

Do Haskell compilers treat the IO monad in a special way? If I write my own monad, would it be optimized in the same manner the IO monad is optimized? is it a requirement for a Haskell compiler to treat the IO monad in a special way?

It's special in the sense

It's special in the sense that it can't be implemented in Haskell, but there's no requirement beyond that. OTOH, it's likely to come in for at least some optimisation, and the implementation is effectively 'magic'.

Nice tutorial.

It's impressive how monads sequence computations and turn a program into imperative style...functional programs start with referential integrity and slowly become imperative whereas imperative programs are turned to functional by compilers doing SSA analysis.

In the end, it seems both approaches end up with the same result!

Not quite true - monads

Not quite true - monads potentially allow more to be specified in the source code, it's easier to insist the presence of an effect be specified than its absence be proven. This isn't yet being used by Haskell compilers as much as would be nice, although when you do the analysis a surprising amount of 'monadic' code is pure scaffolding building things up and the compilers certainly do take advantage of the properties of pure functions.

You could have invented monads!

Sigfpe's superb monad tutorial was mentioned previously, but since this thread is linked from the Getting Started thread, let's mention it here as well:

You Could Have Invented Monads! (And Maybe You Already Have.)

Instead of invoking the fearsome specter of Category Theory and then hand-waving it away, sigfpe describes how monads arise naturally as straightforward solutions to certain types of problems. He even lets you derive them yourself, provided you don't scroll down too quickly.

Sorry for resurrecting this

Sorry for resurrecting this thread...

Philip Wadler has quite a number of papers on monads. Which one(s) should I start with?

I like this one, but

I like this one, but opinions may vary...

I find that one more

I find that one more enjoyable than the very similar one, Monads for Functional Programming so I recommend it as well.

Also, to add, it is completely acceptable even encouraged to "resurrect" old threads.

Thanks Ehud and Derek.

Thanks Ehud and Derek.

intuition for imperitively trained programmers

I like this one. It's in the same spirit as the "You Could Have Invented..." link above but is lower level:

In the monad way of doing things, instead of calling a magic function to perform output, you return values. Instead of calling a magic function to perform input, you return functions (almost callback-style) that can later be invoked with the new input. That's about all there is to it.

And, oh yeah, in this model: RAM is also treated like I/O. Programs don't directly touch RAM -- they output things to be stored and read back in things previously stored.

Then, when the imperitively trained programmer comes around to "But wait, isn't that incredibly awkward?" you can start talking about closures and continuations and other things that make it incredibly easy (and clean).

-t

Yet another tutorial.

James Iry has been writing an introduction to monads as series of articles in his blog. It is in Scala, something that may appeal to programmers jumping into the FP world coming from Java. I'm finding it quite clearly explained and well written.

Part 1
Part 2
Part 3
Part 4