Explaining monads

I am off to bed, so I'll get back to this tomorrow per Luke's request.

For the time being I recommend section 2 of Wadler's Monads for functional programming.

Still don't quite get it

I read the paper, and I thought I was beginning to understand what monads are. Then I lost it again. Monads seem to be presented as a way of avoiding side-effects. This doesn't seem to be true. From what I've seen they are merely a way of delaying side-effects until some later time. For instance, reading about the IO monad in Haskell gives me the impression that instead of actually performing I/O, you create a computation which when executed will perform the I/O. This computation is then returned as the value of the program's main function. But then what happens? "And then a miracle occurs?" Presumably, this computation is then executed in an impure fashion? So rather than allowing Haskell to do I/O, this allows Haskell to create impure programs which do I/O and hand them off to some other language interpreter. Is this right? If so, isn't it a bit of a cop-out (a tacit acknowledgement that pure languages are of limited use in an impure world)? If not, what am I missing?

I can see the utility of using monads for simulating state, as a design pattern. This seems to work as you really can avoid side-effects this way. But I/O is side-effects, so there's no getting around it. So, what is the advantage of using monads for I/O? You eliminate side-effects from the Haskell program, so it is still nice and easy to reason about. But then don't you still have the problem of reasoning about what the monad will do when it is executed (outside of your program)?

sequencing

Monads seem to be presented as a way of avoiding side-effects. This doesn't seem to be true. From what I've seen they are merely a way of delaying side-effects until some later time.

Yes, they do delay the side effects until running the computation, but they also explicitly sequence the effects. That's the common feature of all monads: explicit sequencing (just like CPS).

For instance, reading about the IO monad in Haskell gives me the impression that instead of actually performing I/O, you create a computation which when executed will perform the I/O. This computation is then returned as the value of the program's main function. But then what happens? "And then a miracle occurs?"

The "miracle" that occurs is no different from the miracle that occurs whenever any language has I/O: it always has to be implemented by some underlying system. That's not the problem the I/O monad is solving. With laziness, it's hard to know what order things will happen in. The I/O monad sequences the effects so you know they'll happen in the right order.
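
For instance, here is a minimal sketch of how the sequencing becomes explicit (the name greet is just illustrative): bind pipes the result of one action into the next, so the read has to happen before the write.

-- A minimal sketch: >>= feeds the result of one action to the next,
-- so the read must happen before the write.
greet :: IO ()
greet = getLine >>= \name -> putStrLn ("Hello, " ++ name)

-- the same thing in do-notation:
-- greet = do name <- getLine
--            putStrLn ("Hello, " ++ name)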

As for when it gets run, every monad comes with some sort of a "run" operation, which goes from Tα -> α. This is the only way to "get out of" the monad. All the monads in Haskell have some sort of library function for running the computation, except for the I/O monad. The only way to run the I/O monad is to run your entire program (from the command line, or the REPL, or whatever). So in this case, the monad's "run" operation corresponds to the program's "run" operation.
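
For instance, a minimal sketch of the difference, using the standard Control.Monad.State library (the names tick and result are just illustrative):

import Control.Monad.State (State, runState, get, put)

-- State has an explicit run function that takes you from "State s a"
-- back to an ordinary value (plus the final state)...
tick :: State Int Int
tick = do n <- get
          put (n + 1)
          return n

result :: (Int, Int)
result = runState tick 0   -- (0, 1)

-- ...but there is no "runIO :: IO a -> a". The only way to run an IO
-- computation is to make it (part of) main and run the whole program.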

If so, isn't it a bit of a cop-out (a tacit acknowledgement that pure languages are of limited use in an impure world)? If not, what am I missing?

I don't like all the talk of "purity." I/O isn't bad, it's just hard to reason about in conjunction with a lazy language. Shutting all I/O into the monad means the monad laws will hold of all I/O computations.

I can see the utility of using monads for simulating state, as a design pattern. This seems to work as you really can avoid side-effects this way. But I/O is side-effects, so there's no getting around it. So, what is the advantage of using monads for I/O? You eliminate side-effects from the Haskell program, so it is still nice and easy to reason about. But then don't you still have the problem of reasoning about what the monad will do when it is executed (outside of your program)?

Again, it's not about eliminating I/O, it's about forcing it to obey certain algebraic laws.

Ahhh!

Thanks! Now, I think I finally understand.

IO violates RT

I don't like all the talk of "purity." I/O isn't bad, it's just hard to reason about in conjunction with a lazy language. Shutting all I/O into the monad means the monad laws will hold of all I/O computations.

It doesn't matter if a language is lazy or strict: invisible IO (and other invisible side-effects) violates referential transparency.

IO monads

So, let me get this straight: when running a Haskell program that uses the IO monad, does it do the entire calculation and then only finally 'execute' the monad, outputting everything in one go? Or is there some clever trickery in the compiler/interpreter that lets it perform the IO 'as it goes along'?

Trickery

Because IO obeys the algebraic monad laws, the compiler can safely transform functions that create IO actions into code that directly performs IO. This code then "drives" the evaluation order of pure lazy functional code.

There is a good paper about how IO monads are transformed and optimised in the GHC compiler, but I don't have the reference handy.

yes, sort of

Conceptually, the description "evaluate a function to get an IO program, then execute the IO program" is correct (with the caveat that the generated IO program is often infinitely large). Due to the lazy evaluation of Haskell, both parts are actually interleaved, so the operational description is more like "evaluate enough of the function to know the next IO step, execute the IO step, then go back to evaluating".
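
For instance, a minimal sketch (echo is just an illustrative name): the IO value described here is an infinite chain of actions, so it can only ever be produced a step at a time, interleaved with executing those steps.

-- Conceptually this builds an infinite IO "program": read a line, print it,
-- then do the whole thing again. It can never be evaluated in full first;
-- evaluation and execution are necessarily interleaved.
echo :: IO ()
echo = getLine >>= putStrLn >> echo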

It's not laziness but functional abstraction

See here.

Meaning the monad constructs

Meaning the monad constructs an imperative program to do the IO.

See "All about monads"

http://www.nomaware.com/monads/html/index.html

This is the explanation that finally did it for me.

Don't get too hung up on the IO monad, because IO is a rather atypical special case. Most monads have a "run" function that takes a monadic action and executes/evaluates it (e.g. runST, runState, runCont). IO doesn't because the whole program is an IO action.

Within a monadic action (which is composed of smaller actions) side effects happen. To be more precise, the monad defines a context through which the effects propagate from one sub-action to the next. For example within the "State" monad the state value is passed along as a hidden parameter by the "bind" function. This hidden parameter is accessed by "get", and replaced by "put".
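
A minimal sketch of that hidden parameter in action (push and pushTwice are just illustrative names):

import Control.Monad.State (State, runState, get, put)

-- "bind" passes the state along behind the scenes; get reads the hidden
-- parameter and put replaces it.
push :: Int -> State [Int] ()
push x = do stack <- get
            put (x : stack)

pushTwice :: State [Int] ()
pushTwice = do push 1      -- the state produced here...
               push 2      -- ...is silently handed to this sub-action

-- runState pushTwice [] == ((), [2,1])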

The ST monad is similar, except that instead of a fixed state value you have a dynamic heap to play with. You could almost simulate it in the State monad using a resizeable array and passing out index values, except that garbage collection wouldn't happen. Once you have constructed your ST action you call runST on it. From the outside runST is a function: it preserves referential transparency, and if you call it twice with the same arguments you are guaranteed to get the same result. On the inside you have side effects propagating through the ST implementation of "bind", so you can write imperative code.
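
A minimal sketch of that (sumST is just an illustrative name): imperative-looking code inside, an ordinary pure function outside.

import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, modifySTRef)

-- Inside runST we mutate a real reference cell; from the outside sumST is a
-- plain pure function, so calling it twice with the same list gives the same
-- result.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef acc (+ x)) xs
  readSTRef acc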

IO is like ST, except that it's not just the state of a heap that gets passed through "bind"; the state of the world gets passed along as well. There is no runIO function because the world cannot be passed around like a normal parameter. Instead you run the entire program.
(Of course the infamous "unsafePerformIO" looks a lot like a "runIO" function, but it doesn't preserve referential transparency).

Paul.

I have not yet found a better tutorial...

Than the one here. This is linked from the TUNES wiki node for Monads; please add useful links and information there.

Archive url

That domain seems to have changed owners, here is a link to the last archived version (archive.org).

It can be found on

It can be found on haskell.org now.

Clean world?

The paper talks about SML & Scheme as impure, and Haskell & Miranda as pure. No mention though of Clean's use of world to isolate side effects.

So where would Clean fall in this list? Are the worlds in any way considered monads?

Clean has unique types

Every function that wants to change the world must get a world parameter and return the changed world. As the world type is linear the compiler will flag another usage of the old world:


// * says the type is unique or linear
// (i.e. can be consumed only once)
fwritec :: Char *File -> *File

WriteABC :: *File -> *File
WriteABC file = fwritec 'c' (fwritec 'b' (fwritec 'a' file))

// or
WriteABC file = file3
where
    file1 = fwritec 'a' file
    file2 = fwritec 'b' file1
    file3 = fwritec 'c' file2

This chapter of the Clean 2.0 language report contains the relevant information.

Unique types are quite powerful: with them we can define the IO type within the language (instead of having it as a primitive), and it would look much like the state monad.

Monads are a way to structure programs, in particular the IO monad in Haskell is a way to model IO. Unique types are another (orthogonal) way to structure programs, being used in Clean to model IO.

linearity and monads

Unique types are another (orthogonal) way to structure programs, being used in Clean to model IO.

The connection:

Imperative Functional Programming

Rolling Your Own Mutable ADT -- A Connection Between Linear Types and Monads

Concurrency?

Sometimes I think that discussion of Monads gets too intertwined with the Haskell syntax for it. Figure it doesn't hurt to look at it from another language which uses a different syntax & vocabulary. From the Clean perspective, I gather the Unique type is tied to the concept of sharing and locking of types.

Would it be fair to say that Monads are also tied to the issue of Concurrency? (Not just during runtime, but also compile time?)

Schemer's introduction

I don't mean to scoop you, Ehud, but ever since Luke's comments yesterday I've started working on a Lispy (OK, Schemey) introduction to monads. It's just a first draft, but my aim is to build the basic ideas from the ground up. It could use help, so suggestions greatly appreciated.

Luke, let me know if this is at all helpful. Don't be nice -- if it's useless, just say so. :)

Meantime, I've found All About Monads pretty approachable in the past, although Haskell comes with a bit of baggage for learning monads (esp. type classes and the strange and misleading IO monad).

Very good!

Thank you Dave!

I seem to understand Monads now after reading your article. Of course, I've said that before, so I'd better do an exercise to be sure :-). I tried to write the "collect even and odd numbers by effect" example from a recent discussion. I think my solution is close, but it gets rejected by the type checker and I can't figure out why:

import Monad
import Control.Monad.State

-- Separate a list of integers into evens and odds.
-- Return the results in an (evens,odds) tuple.
evens_and_odds :: [Integer] -> ([Integer],[Integer])
evens_and_odds list =
    state where (_, state) = runState (do mapM_ collect list
                                          return get) ([], [])

-- Collect an integer into the even or odd part of the state.
collect :: Integer -> State ([Integer],[Integer]) ()
collect n = do (e, o) <- get
               if n `rem` 2 == 0
                  then put (n:e, o)
                  else put (e, n:o)
               return ()

main :: IO ()
main = do putStr $ show $ evens_and_odds [1,2,3,4,5,6]

Can someone spot the bug - am I close? And how would a real Haskeller have written that (assuming an irrational desire to collect by effect)?

Anyway, the article is very good: it clearly explains (to a Schemer) what a Monad is and what it's for. The final monadic code is still fairly ugly and is doing something quite unsexy so I would have been unsatisfied if you hadn't included the link to All About Monads. That catalogue of Haskell monads and their examples seems like exactly the right next step from understanding the mechanics.

Thanks again!

IIUC (I can't run your exampl

IIUC (I can't run your example right now) the offending line is this "return get". Here "get" has type "m s" while return has type "a -> m a" so here you are trying to return the state transition "m s" instead of the state. You can do two things:

1. Substitute "return get" by

s <- get
return s

or even simply

get

which is equivalent.

2. Ignore this, because you don't need to return the state explicitly. The "runState" call has type "s -> (a,s)" and since you're applying this to the initial state ([], []) it will return the last value (i.e. () from the mapM_ call) and the last state (i.e. (evens, odds)).

Also you don't need to write "return ()" in the end of "collect" because "put" has type "s -> m ()", that is the last expression in the do block (i.e. "if ...") has the correct type and "return ()" is just a no-op.

Great

Thanks, that gets it working:

evens_and_odds :: [Integer] -> ([Integer],[Integer])
evens_and_odds list =
    state where (_, state) = runState (do mapM_ collect list) ([], [])

collect :: Integer -> State ([Integer],[Integer]) ()
collect n = do (e, o) <- get
               if n `rem` 2 == 0
                  then put (n:e, o)
                  else put (e, n:o)

Obviously I'm not on top of this yet, but now I've got a foothold and I'm interested.

If someone could next show what makes type systems so interesting in terms that I'll understand then I'll be well on my way into the modern functional programming world.. :-)

macros

The final monadic code is still fairly ugly...

You're absolutely right -- I forgot the punchline for the Scheme version of the story. Monadic code is ugly precisely because you're making the threading of control explicit. Worse, you have to give a name to the result of every intermediate computation (hint: this is a big link between monads and A-normal form). The designers of Haskell added the do-notation as syntactic sugar for easing some of the notational overhead in the explicit sequencing. (There's not much you can do in general about the nuisance of intermediate names, though.)

Schemers don't need to ask the language designers for syntactic sugar, though: macros are our own personal confectionary. To complete the rand example, here's a quick monadic compiler (very minimally tested):

(define-syntax <get>
  (syntax-rules ()
    [(<get>) (get-seed)]))
(define-syntax <set>
  (syntax-rules ()
    [(<set> v) (set-seed v)]))
(define-syntax <return>
  (syntax-rules ()
    [(<return> v) (lift v)]))
(define-syntax <let>
  (syntax-rules ()
    [(<let> (a m1) m2)
     (pipe m1 (lambda (a) m2))]))
(define-syntax <begin>
  (syntax-rules ()
    [(<begin> m1 m2)
     (pipe m1 (lambda (ignored) m2))]
    [(<begin> m1 m2 m3 ...)
     (pipe m1 (lambda (ignored) (<begin> m2 m3 ...)))]))

Notice that we no longer need the earlier definition for begin, because it's just a special case of pipe. Finally, we can rewrite rand:

;; -> T(number)
(define (rand)
  (<let> (seed (<get>))
    (let ([ans (modulo (* seed 16807) 2147483647)])
      (<begin> (<set> ans)
               (<return> ans)))))

This should look familiar... If you're thinking, "well that's pretty much just what we had to start with when we allowed ourselves to use set!," then you're right. The idea is to simulate effects using a pure host language. If you already have the effects, there's no need to simulate them.

No magic?

So is it the case that given a lazy functional language with primitives for unrestricted side-effects one can still write real imperative programs (e.g. webservers) by programming in monadic style? And thus Haskell's Monad support is "just" a type system to enforce this style plus some syntactic sugar?

Re: No magic?

From my naive viewpoint, yes. This is even "admitted" in papers from various Haskell gurus, with a little note about how imperative programming in Haskell is safer than in most languages :)

Here's an interesting and easy to follow paper, though not the one I was looking for: Lazy Imperative Programming

Re: No magic?

given a lazy functional language with primitives for unrestricted side-effects one can still write real imperative programs

Depends what you mean by unrestricted, and probably depends on the language/implementation. Haskell implementers warn that you can't rely on the unsafePerformIO extension, a primitive that produces side effects outside of the monad. So it's not just a matter of throwing in some side effects and then the programmer can gain control of them.

What goes on behind the scenes is that the implementation tracks the primitives which produce arbitrary effects, and expressions built from them. It ensures that the primitives are invoked in exactly the sequence determined by the program. If this mechanism were made to sound more impressive, maybe it would qualify as magic.

You can create a web server in Haskell, but due to the IO monad putting every action into a sequence, you don't get a typically nice multithreaded web server. For that kind of thing there's an extension called Concurrent Haskell.

Order of evaluation, or, some magic?

I don't understand how Monads enforce order of evaluation. Dave's article shows that each operation takes the result of the previous one as an input, which is obviously some kind of dependency chain. It would be enough to handle Scheme's "strict, but function arguments can be evaluated in any order" rules to write a working begin function. But how do we know that an (e.g. lazy) evaluator will make the real-external-world effects happen in the right order?

It sounds like you're saying this aspect is handled specially by the compiler/runtime system, is that right?

Dave's thesis committee will certainly want to know this. :-)

The New World

...each operation takes the result of the previous one as an input, which is obviously some kind of dependency chain. It would be enough to handle Scheme's "strict, but function arguments can be evaluated in any order" rules to write a working begin function. But how do we know that an (e.g. lazy) evaluator will make the real-external-world effects happen in the right order?

Laziness works by wrapping constructors with a delay and selectors with a force. In other words, when you actually need to extract a piece of a value, you have to make sure the value has been computed.

So one way to ensure something gets computed in a lazy program is to select a piece of its result. (Of course, if that part of the program is delayed, i.e., the part doing the selection, then it won't be computed, nor if the part computing the part computing the part... and so on. At the top-level, a lazy program has to be kicked off or nothing will ever compute.) As you say, a program in monadic style builds up a chain of dependent computations: each successive computation requires the result of the previous computation before it can be executed. Then in the top-level (i.e., the monad's run function), it extracts the result of the entire computation. Since this result isn't available without running the last computation, the entire chain must be evaluated.

In the case of the IO monad, check out Imperative Functional Programming (link above), esp. sections 2 and 4. The idea is to thread a dummy accumulator that stands for "the state of the world" through the computations. This dummy is encapsulated in the abstract IO data type, so the programmer can't touch it. Since a non-IO function can't call an IO function, the only way IO operations can be kicked off is from the top-level main function. The intuition is that the REPL hands the current state of the world to main, which threads the world through its IO computations, and ultimately returns the modified world when the program completes. To kick off the entire computation, you can imagine the top-level asks for the final state of the world in order to force the computation to take place.

This chain of computations also enforces linearity, meaning that the modified "world" resulting from an IO computation will only be accessed once, namely when it is handed off to the next computation in the chain. This allows the Haskell implementation to perform the actual real-external-world operation in-place when it executes an IO computation, since it knows that this is the only place in the program where that state of the world will occur.
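
Here is a minimal sketch of that intuition. This is only a model, not how any real Haskell implementation defines IO; the type World and the names MyIO, bindIO and returnIO are purely illustrative.

-- A toy model of world-passing. "World" is an abstract token the programmer
-- never sees; every action consumes the current world and returns a new one,
-- so the effects are chained in one definite order.
data World = World

newtype MyIO a = MyIO (World -> (a, World))

bindIO :: MyIO a -> (a -> MyIO b) -> MyIO b
bindIO (MyIO m) f = MyIO $ \w0 ->
  let (x, w1) = m w0          -- the first action must run to produce w1...
      MyIO m' = f x
  in m' w1                    -- ...before the second action can receive it

returnIO :: a -> MyIO a
returnIO x = MyIO $ \w -> (x, w)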

Blindingly simple...to non-FPers

The idea is to thread a dummy accumulator that stands for "the state of the world" through the computations.

And that's trivially easy to understand in terms of low-level programming. Once you start talking about the three laws, etc., then you've lost most people. It's like saying "now the first thing you need to understand about iteration is what a loop invariant is," when what programmers want to hear is "this is how you write a while loop."

Dave's thesis committee will

Dave's thesis committee will certainly want to know this. :-)

Are any of them still alive? I heard there was some kind of "accident" with a begin...

Monads are like CPS in this sense

If we convert a piece of code to CPS we can enforce the order of evaluation (especially useful in a lazy language like Haskell). Monads behave similarly. The type of bind is "(>>=) :: (Monad m) => m a -> (a -> m b) -> m b", which says: take a computation and a continuation, and return the computation resulting from applying the continuation to the result of the given computation. That forces the left argument of bind to be evaluated before the right one.

Also the monadic do notation translates to a series of binds and they look quite like continuations:



do x <- mx
   y <- my
   return (f x y)

-- which translates to:

mx >>= (\x -> my >>= (\y -> return (f x y)))

Monads and Scheme

Interesting introduction.

The paper "Abstraction and Performance from Explicit Monadic Reflection" by Jonathon Sobel, Erik Hilsdale, Kent Dybvig, and Dan Friedman is quite interesting. It shows how to transfer the monadic paradigm to Scheme in a very natural way.

Other texts on monads and Scheme can be found at IdiomMonadicProgramming at the SchemeCookBook.

I guess I am redundant

Seems like yet another explanation of monads isn't really needed, so I won't bother you with one...

Not so

I took your advice and started reading Monads for functional programming. I'm taking a break now after section two but it's a marvellous paper, so thank you for the recommendation. I thought that I'd attempted it before but I must have tried one of his less introductory papers by mistake.

Of course I'm duly pleased that he expresses everything in plain ol' code.

Hm

Referring to typed lambda calculus as "plain ol' code" may be a sign that I need to re-examine some of my prejudices about formalisms. :-)

Sorry

I didn't mean to cut you off. (I actually think the more explanations the better, because I never understand anything the first time.)

No prob

Perfectly alright. Since I was quite busy all week it was great that others provided the explanation I promised to get back to.

Not just sequencing

There is actually no requirement that monads sequence things in any way, it is just one of the most common ways monads are used.

For examples of monads which do not 'sequence' their arguments, see (in the ghc haskell libraries)

Control.Monad.Reader which distributes a value to all of its sub-computations but otherwise evaluates them with normal lazy semantics.

or
Control.Monad.Writer which does something similar, but collects an extra value from its subcomputation and lazily combines them.

or, for the grand finale,

Control.Monad.Identity which is the identity monad. Monadic code run in this monad behaves identically to code written without any monads at all (monad errors are translated to bottom). This is a very useful monad indeed.
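
For instance, a minimal sketch (double is just an illustrative name):

import Control.Monad.Identity (Identity, runIdentity)

-- Monadic code run in Identity is just ordinary pure code.
double :: Int -> Identity Int
double x = do let y = x * 2
              return y

-- runIdentity (double 21) == 42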

I am unsure whether learning about monads as general constructs, one thing (of many) which they will let you do is sequencing will help or hurt the new-to-haskell programmer.

Re: yes, sequencing

(I just noticed the above post. For posterity's sake, I don't want to let it be the final word.)

For examples of monads which do not 'sequence' their arguments, see

Monads do not have "arguments." However, computations in a monad are always sequenced. This is exactly what the >>= and >> combinators are for: sequencing computations.

Control.Monad.Reader which distributes a value to all of its sub-computations but otherwise evaluates them with normal lazy semantics. or Control.Monad.Writer which does something similar, but collects an extra value from its subcomputation and lazily combines them.

These examples absolutely involve sequencing. For example, the following computation:

import Control.Monad.Writer

doStuff :: Writer [String] ()
doStuff = do tell ["message 1"]
             tell ["message 2"]
             tell ["message 3"]

produces the following interaction:

*Main> snd $ runWriter $ doStuff
["message 1","message 2","message 3"]

The result list is in that order because -- as always -- the sub-computations in the monad are performed in sequence.

I am unsure whether learning about monads as general constructs, one thing (of many) which they will let you do is sequencing will help or hurt the new-to-haskell programmer.

I'm having trouble parsing that sentence, but I think you're questioning whether it's helpful for new Haskell programmers to think of monads in terms of sequencing. I don't know how else you'd like people to think of it. Do you want them to think about domains and Kleisli triples?

Simon Peyton Jones likes to describe Haskell as his favorite imperative programming language, and this is the theme of Imperative Functional Programming, that monads are a way of embedding imperative code, i.e. sequences of instructions, in the middle of (lazy) functional code.

fustigate (v.) 1. to beat with a stick; 2. to criticize severely

Roughly, a monad is commutative if, when both sides make sense:
do { x <- m; y <- n; p }  =  do { y <- n; x <- m; p }
By this criterion, the reader monad is commutative: it doesn't matter what order you read things in. The usual writer monad, however, is not commutative, and neither is the state transformer monad for most state types. A very important monad that is commutative is the substitution monad for an algebraic theory.
However, computations in a monad are always sequenced.

They are not sequenced in a commutative monad. In particular, a compiler which could prove that a monad is commutative would be free to reorder computations, because it is not an observable property.
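
A minimal sketch of the difference, using the standard Reader and State monads (the helper names are illustrative):

import Control.Monad.Reader (Reader, runReader, ask)
import Control.Monad.State (State, runState, get, put)

-- Reader is commutative: the two sub-computations can be swapped.
readerAB, readerBA :: Reader Int (Int, Int)
readerAB = do { x <- ask; y <- ask; return (x, y) }
readerBA = do { y <- ask; x <- ask; return (x, y) }
-- runReader readerAB 7 == runReader readerBA 7 == (7, 7)

-- State is not: swapping "tick" and "get" is observable.
tick :: State Int Int
tick = do { s <- get; put (s + 1); return s }

stateAB, stateBA :: State Int (Int, Int)
stateAB = do { x <- tick; y <- get; return (x, y) }
stateBA = do { y <- get; x <- tick; return (x, y) }
-- runState stateAB 0 == ((0, 1), 1)   but   runState stateBA 0 == ((0, 0), 1)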

Simon Peyton Jones likes to describe Haskell as his favorite imperative programming language, and this is the theme of Imperative Functional Programming, that monads are a way of embedding imperative code, i.e. sequences of instructions, in the middle of (lazy) functional code.

If you look at Wearing the hair shirt: a retrospective on Haskell, also by Simon PJ, you will see that one of the things he would like for Haskell is a way to distinguish commutative monads from non-commutative ones, precisely because the current scheme overly sequentializes things.

(BTW, another thing he says is that "our biggest mistake [in designing Haskell was u]sing the scary term "monad" rather than "warm fuzzy thing". :)

linearity, then

They are not sequenced in a commutative monad.

I stand corrected. Let's refine my explanation, then. Sequentiality is still critical for all non-commutative monads, because basically you're hosed if you can't guarantee that your effects will occur in the proper order, no matter what (non-commutative) effects you're talking about.

For all kinds of effects, both commutative and non-commutative, it seems linearity is the common element: even for the "fresh value" monad (e.g., gensym), where it doesn't matter if expression A gets g257 and expression B gets g258 or vice versa, just so long as each gensym occurs exactly once.
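
A minimal sketch of such a fresh-name supply, built on State (the names Gensym, fresh and pair are illustrative):

import Control.Monad.State (State, evalState, get, put)

type Gensym = State Int

-- Each call hands out a name exactly once; whether expression A gets "g0"
-- and B gets "g1" or vice versa doesn't matter, as long as no name is reused.
fresh :: Gensym String
fresh = do n <- get
           put (n + 1)
           return ("g" ++ show n)

pair :: Gensym (String, String)
pair = do a <- fresh
          b <- fresh
          return (a, b)

-- evalState pair 0 == ("g0", "g1")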

(BTW, another thing he says is that "our biggest mistake [in designing Haskell was u]sing the scary term "monad" rather than "warm fuzzy thing". :)

My friend wants to start a line of Haskell-inspired sleepwear known as Simon PJ's. I can only assume they'll be warm and fuzzy.

Damn you beat me to it...

Hmm, I saw the error too. However, from my layman's perspective - I wondered about the following. Is it the commutative property or the operational semantics of a monad (the IO monad in particular) which enforces the sequential operation of the composition?

I couldn't help but notice that in the Jones/Wadler paper they forgot to define a "run" function (yeah, I know it is impossible to define explicitly); they just assumed the operational semantics to be: first do program A, then give its output o to B, and then perform program B.

Semantically, there is nothing, however, which enforces that program A is run to completion before program B. Worse, given the lazy semantics of Haskell, I think I can define a program A which returns a (lazy/partial) list of values as output o which can be interpreted by B which may force the rest of the evaluation of A. I.e. there is nothing in the monad which enforces the sequential evaluation of the programs A and B; rather, it seems to be a property of stringing together strict programs...

Right?

Note: I removed the most obvious typos.

"The source of all type errors are type systems."

Is it the commutative property or the operational semantics of a monad (the IO monad in particular) which enforces the sequential operation of the composition?

The former. For example, the IO monad in GHC also includes an operation forkIO.

Where are you going with that fork!

First, no idea what you meant with that forkIO remark. Please elaborate if you still feel like. (I don't)

Uhm, second, I should have phrased my remarks better.
It should have been stated something like:

Even if the IO monad satisfies all the monad laws and is linear (non-commutative), is its correct operation not a feature of the operational workings of some implementation of an abstract signature, rather than a feature of the algebraic laws that that signature satisfies?

Or something like that...

Whatever...

Maybe we should just conclude that IO monads are there to do IO stuff in a type-safe manner and leave it there? This is getting boring and I don't have the feeling anyone is interested. (I am not)

Naive answer

You have to evaluate

x = 2 * 3

and

y = 7 + 4

before you can evaluate

z = x * y

and I would be quite surprised to learn that this was a feature of "the operational workings of some implementation" that evaluated those expressions. My understanding of the matter is that both CPS and (non-commutative) monads exploit dependencies of this kind in order to impose/express sequencing of evaluation. Also that monads, at least, do so via their type signatures, which define the dependencies that must be satisfied for the evaluation of each expression.

I am not that sure too, however...

Program A does IO and returns a lazy list; program B takes it as input; while B evaluates that list it runs A to completion, which does more IO. Thus the evaluation of A and B is intertwined/not sequential (to the outside world).

Most monads, over a hash map for instance, will be lazily evaluated when run in a lazy language like Haskell (for example, inserting a key/value pair is only "run to completion" on the map once you ask for that key). It just doesn't matter, and a programmer doesn't really notice, since the program is referentially transparent.

However, as a concrete instance of the contrived example: what happens if I had a program which returns all names of files in a directory as program A, and a program B which erases all files in that directory with the inverse name. If the directory contains files "ab" and "ba" is it going to be empty afterwards or not? If A is strict, yes, if A is not, the directory might contain the file "ab".

Thus, sequential composition need not enforce sequential evaluation in general.

This is just a technical point with regard to the sequencing remarks. *yawn* boring! duh, grrr.

The naivety continues

I think the two possible behaviours for Program A and Program B actually look different in code.

Example one: the first monad runs to completion, constructing a complete list in the process that is then consumed by the second:

main :: IO ()
main = do fileDescriptors <- getFileDescriptors;
          deleteInverses fileDescriptors

getFileDescriptors :: IO [FileDescriptor]
deleteInverses :: [FileDescriptor] -> IO ()

Example two: a lazy list of monadic values is constructed and (presumably lazily) consumed by a fold:

main :: IO ()
main = deleteInverses getFileDescriptors

getFileDescriptors :: [IO FileDescriptor]
deleteInverses :: [IO FileDescriptor] -> IO ()

In the first example, deleteInverses must run the monad getFileDescriptors in order to get a complete list of FileDescriptors; in the second, it must run each of a lazy list of monads to get each of the FileDescriptors it will use. There is no way it can consume a FileDescriptor without running some monad, and hence no way it can operate on a lazy list of FileDescriptors: the type of an IO action consed onto a list of IO actions is [IO a], not IO [a].
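
A minimal sketch of that distinction (the list is illustrative; sequence is the standard library function):

-- A list of actions is not the same thing as an action producing a list.
actions :: [IO Int]
actions = [return 1, return 2, return 3]

-- To get at the results you must run the actions, e.g. with the standard
-- sequence :: [IO a] -> IO [a], which performs them one after another.
results :: IO [Int]
results = sequence actions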

Cogito ergo sum.

no idea what you meant with that forkIO remark.

You asked (or I understood you as asking) if there was anything intrinsic about monads which ensure that side-effects of operations are sequenced. My answer was, no, there is nothing, it depends on the monad. As a counterexample to the idea that side-effects are sequenced in every monad, I gave the IO monad, which supports an operation forkIO. Since forkIO creates a new process running in parallel with the current execution thread, one which may have side-effects, it is clear that IO is an example of a monad that does not sequence side-effects.
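
For the record, a minimal sketch of what I meant (forkIO and threadDelay are in Control.Concurrent; the rest is illustrative):

import Control.Concurrent (forkIO, threadDelay)
import Control.Monad (replicateM_)

-- The forked thread's output can interleave arbitrarily with the main
-- thread's, so IO does not impose a single global order on side effects,
-- only an order within each thread.
main :: IO ()
main = do _ <- forkIO (replicateM_ 3 (putStrLn "child"))
          replicateM_ 3 (putStrLn "parent")
          threadDelay 100000   -- give the child thread a chance to finish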

Je pense donc j'ai mal à la tête.

Having begun a whole discussion by saying "monads are all about sequencing!" and then having lots of people say, "no they're not!" I hope I'm not being stubborn, but it still seems to me that sequencing is very relevant. When you say

IO is an example of a monad that does not sequence side-effects

I don't quite buy it. Sure, not all IO effects are sequenced, since you can branch from the main "trunk" and the effects from multiple branches can be interleaved. But within each branch, the effects are in fact still sequenced. And the fact is that the world-passing style enforced by the IO monad still guarantees that sequencing. I mean, that's the whole point.

At least when you're using the unit/bind formulation, computations are chained together in an implicit sequence, since the result of one sub-computation is piped to the next sub-computation. If you have other operations, like forkIO, that allow you to create new orderings of operations, then the monad may not guarantee a unique ordering for all sub-computations, but there still are some sub-computations that have to happen before others.

Also, when people say monads can be used as a framework for talking about all kinds of effects, that sounds like voodoo. Explaining that the common element is talking about the ordering of the effects helps demystify it.

I suppose a fork's out of the question

The kind of sequencing introduced by a monad specifies that one thing must happen (be evaluated) before another can happen (be evaluated), but it doesn't rule out the interleaving of other things happening (being evaluated) at the same time, and not necessarily in any strict order either.

I kind of like the idea of "a framework for talking about all kinds of effects" - I find I want to understand it better.

It is hopeless...

really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Yes, monads are quite elegant from a pure theory POV, but if there ever was a feature that will prevent pure functional programming from going mainstream, this is it.

It doesn't help that even if you "get" monads, in everyday usage even making small mistakes will spawn the most horrific type errors from most implementations, at least last time I checked.

Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Paradigm Shift

Wouter: really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Poppycock. If these "very seasoned programmers" are only seasoned in C/C++, then there's a lot of stuff in Haskell that will blow their minds for a while. So what? That's the nature of a paradigm shift. In my experience, both with myself and many colleagues over the years, there's a certain amount of weeping, wailing, and gnashing of teeth anytime a new paradigm is used on a project, even if the language doesn't change—for example, I introduced boost::bind and boost::tuples on a C++ project recently, and still get questions about them. Good thing I didn't use boost::lambda, boost::signals, or boost::spirit...

This attitude is the thing I hate most about our industry: the notion that, hey, I already know enough; quit asking me to learn something new!

Wouter: Yes, monads are quite elegant from a pure theory POV, but if there ever was a feature that will prevent pure functional programming from going mainstream, this is it.

Monads aren't just elegant from a pure theory POV, although they certainly are that. They're also absolutely necessary if you wish to extend your use of specific functional strategies across certain boundaries. What I do agree could make a difference to the user of a language is whether, as a programmer, you must identify where one or more monads are necessary, arrange for their construction, apply any necessary monad transformers (which don't compose in the general case, which is my principal beef with monads), and so on; or whether the language might automagically ensure that code that has side-effects or does I/O, or both, is handled monadically, including the application of transformers that are known to compose in the necessary cases. In other words, how far can we take type inference in the service of automating some of the awkward squad?

Wouter: It doesn't help that even if you "get" monads, that in everyday usage even making small mistakes will spawn the most horific type errors from most implementations, at least last time I checked.

This is the usual complaint about an implementation rather than a concept. Helium is deliberately designed to do a much better job at this than most Haskell implementations, although it does so partially by sacrificing features. No doubt it and other implementations will improve over time.

I would also observe that using templates in C++ is absolutely no better. Perhaps the point is that we need something better than spitting out long strings of text for presenting errors in the context of a rich type system.

Wouter: Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Give up. Go learn enough Category Theory to be able to cope with monads. Yeah I know it is hard, sorry.

Difficulty of understanding monads overrated

Much as I struggle with CT, having neglected to struggle at sufficient length with the things CT is about, I don't think that understanding the way monads work in Haskell is particularly hard. Hutton and Meijer's tutorial on monadic parser combinators was a good way in for me, as it built up the monad combinators from scratch and showed how they were used to combine parsing operations. After getting through that, I felt I had my feet on solid ground.

You can use the monad syntax in Haskell without even understanding (or having once understood, then forgotten) that much. I admit that I still think of monads as being really "about" sequencing and threaded state, but as a way of thinking about the non-commutative monads that's good enough to be getting on with.

I too think that automagical monadification of side-effecting code would make life easier for the poor toiling programmer, and wonder whether anything of the sort has been tried. I would also note that monad transformers are not the last word in composing monads, as this paper for instance suggests.

Stones

I don't see what category theory has to do with this subject except that it confuses the subject even more. I don't think Wouter ever needed it when he defined any of his languages - and, if anyone, I guess he is the expert on languages.

Also, you don't need category theory to understand the monads of Haskell!!! They have perfectly clear operational or axiomatic or (who cares) van Wijngaarden semantics.

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

It is easy, there is no need for category theory, and everybody understands it. (Yeah, yeah, the category theorist will now stand up and categorically claim: but a procedure then has an IO monadic type! But, then assembly clearly is lambda calculus too, ain't it...)

PS: I am not that against category theory, I just wonder whether it is the right abstraction used in this case.

Ha!

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

Oddly enough, you can formalize this approach using modal logic.

There is no escape from crazy theory. :)

Nice idea

Actually, I like the idea. Since modal logic is logic w.r.t. worlds, and functional IO changes worlds, it seems pretty natural to express it in a modal logic.

Well, uhm, thanks ;-)

The logic of monads is

The logic of monads is modal, so perhaps this isn't so odd...see for example this paper by Kobayashi (couldn't find a freely accessible version), and this proof-theoretic treatment by Pfenning and Davies. For those (like me) who find this level of theory a bit too crazy, it's worth checking out several of the papers by Aleksandar Nanevski.

That's how I think

You know, one of the most elegant solutions I have ever seen which cleanly deals with side-effects is to simply make a distinction between functions (no side-effects) and procedures (may have side-effects). Functions may use other functions, and procedures may use other functions or procedures.

Actually, that is how I usually think of the IO Monad and the rest of Haskell. [non IO functions] may use other [non IO functions], and [IO functions] may use other [non IO functions] or [IO functions].
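
A minimal sketch of that reading (the names square and report are illustrative):

-- A "function": no side effects, usable anywhere.
square :: Int -> Int
square x = x * x

-- A "procedure": its IO type marks it as possibly effectful. It may call
-- square, but square cannot call report, because there is no way to hide
-- the IO type.
report :: Int -> IO ()
report x = putStrLn ("result: " ++ show (square x))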

if only

A paradigm shift? That is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say laziness. Laziness is a clean, very transparent and simple construct that will have a smooth learning curve, even for a C++ programmer. Once you understand laziness it becomes so solid you don't need extra thought to use it. Not so monads: they are not transparent at all, not simple, and have a learning curve like Mount Everest.

Really, just because it is really different doesn't make it a new paradigm. Yes people are scared of learning new languages/features, and I wish it weren't so. But with monads it isn't their fault. If you think people should just persist a little longer it is time to step out of your small circle.

You Say "Pair-uh-dime;" I say "Pahra-dig-um..."

Wouter: A paradigm shift? That is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say laziness. Laziness is a clean, very transparent and simple construct that will have a smooth learning curve, even for a C++ programmer. Once you understand laziness it becomes so solid you don't need extra thought to use it. Not so monads: they are not transparent at all, not simple, and have a learning curve like Mount Everest.

Since we're now relying on anecdotal support for our arguments, I see your point and raise you with concrete recent experience using tools like boost::bind and boost::lambda to achieve laziness in side-effecting references. Laziness does not have a natural implementation in C++, not surprisingly, because C++ is eagerly evaluated. So you have to be careful to put boost::ref() around references that you wish to remain references in bind or lambda expressions. Basically, it's all about using nullary functors instead of "the objects themselves," and arranging for operator() to be called when you need the functor to be evaluated. And there's never a time in C++ that I get to ignore that, unless I'm writing purely functional code where order of evaluation literally doesn't matter. I'd be very interested in seeing such C++ code: I never have, and I never expect to.

Wouter: Really, just because it is really different doesn't make it a new paradigm.

Except when it does, of course. It's generally accepted in computer science that functional programming is a different paradigm than imperative or object-oriented programming. Texts such as SICP and CTM certainly treat them as distinct paradigms or concepts. Personally speaking, I find the way that I have to think about programming to be sufficiently different among them as to rise to the level of "different paradigms." But as the title jokingly points out, unless we're willing to get into formal definitions of our terms, it's just a matter of taste.

Wouter: Yes people are scared of learning new languages/features, and I wish it weren't so. But with monads it isn't their fault. If you think people should just persist a little longer it is time to step out of your small circle.

You're a bit quick with the ad-hominems for my taste; you presume to know the size of my circle of something, probably programmers that I know, something that I assure you you're ignorant of. Nor are you in an omniscient position with respect to who is able and who is not able to understand monads. I would propose that the existence of this thread, and the increasing number of explorations of monads in languages in which they are not a first-class concept, is suggestive both of their utility, even outside the context of a "purely functional language," and of their understandability, perhaps also as a function of their divorce from their specific implementation in Haskell. Personally, I wasn't able to get a handle on them until I saw implementations in Scheme and O'Caml in rapid succession. Perhaps the comparison between a latently-typed implementation and an implementation in terms of ML modules and types clarified things for me. In any case, I had a powerful learning experience, and I'm loathe to believe that such an experience would somehow be denied to other reasonably intelligent programmers by anything that's inherent to monads that isn't also inherent to higher-order functions.

Of course, that begs the question as to how understandable higher-order functions are.

sorry

for the ad-hominems, I will try to be more ad-verbum :) My impression of your "circle" comes from what you say, and saying things along the lines of "just go learn category theory" doesn't fit into the same real world picture that I have.

I have no omniscient knowledge about what programmers are like. But I teach classes to graduate compsci students almost daily... seeing what to them is difficult and what is not is really interesting. Even though these guys are probably brighter than the "average programmer", asking them to learn category theory just to allow them to be able to cope with underlying complexity of something that is the simplest thing to them right now, sequencing stateful operations, will be very surreal to them. And if these guys won't get it, then lets not even start about the armies of VB/Cobol/Perl/JavaScript/... programmers out there.

And no this would not be different if they had all started out on Scheme or whatever... fundamentally you have a language feature with simple functionality but complex underlying semantic details (== bad).

More Verbiage

Wouter: sorry for the ad-hominems, I will try to be more ad-verbum :) My impression of your "circle" comes from what you say, and saying things along the lines of "just go learn category theory" doesn't fit into the same real world picture that I have.

OK, that's quite fair. Let's recall that my "learn enough category theory" was just a structural edit of your "Give up..." The point wasn't so much that it's necessary to have a mastery of category theory to grasp monads as that there's a fork in the road: you can give up and say "monads are hard," or you can dig in and say "why do monads seem hard?" My personal experience is that they were hard as long as I only examined them in the context of my (limited) exposure to Haskell. The Scheme implementation and O'Caml implementation that I've seen both seem quite understandable given a multi-decade grounding in higher-order functions, which is why I wrote earlier that I don't see anything intrinsic to monads that isn't intrinsic to higher-order functions, and that that only begs the question as to how understandable higher-order functions are.

But that question is a serious question, especially given recent history on my work project in which I find myself explaining and re-explaining boost::bind and boost::lambda to colleagues. It's also why I do maintain that object-oriented, imperative, and functional programming are, in fact, different paradigms, and C++ is a multi-paradigm language, with the functional aspects still somewhat mysterious and underutilized despite the standardization of the STL and the availability of high-quality FP libraries such as FC++, boost::lambda, and boost::spirit::phoenix. They make my life easier, but at least at the moment, they appear to make some of my colleagues' lives harder.

Not really different.

Personally speaking, I find the way that I have to think about programming to be sufficiently different among them as to rise to the level of "different paradigms." But as the title jokingly points out, unless we're willing to get into formal definitions of our terms, it's just a matter of taste.

I don't think that imperative, functional and object-oriented programming are sufficiently different to allow for such categorization. They certainly don't make me think differently. I use higher order programming in imperative languages every day, and I organize my code around modules I call 'objects'...

Good metaphors

Laziness is a clean, very transparent and simple construct... Not so monads

I have to say I'm surprised at this. On a superficial level, monads are easy to use by invoking the crude metaphor of a sequencer, but the full theory behind them is more complex.

On the other hand, laziness has fairly simple theoretic properties, but drastically alters how you reason about programs, especially in terms of performance.

I suspect that someone used to eager evaluation has a much harder time actually getting useful work out of laziness than out of a monad.

The only advantage that I think laziness has over monads is that they are identified using a helpful metaphor. Laziness is a nice anthropomorphic term (pace Dijkstra ;-)) that gives us a visceral idea of what it does.

Though "warm fuzzy thing" is certainly appealing ;-), perhaps calling non-commutative monads "sequencers" and commutative monads "summers" or "joiners" might have made things easy for people who mainly want to program without confusing or cheating those who wanted to go beyond into the theoretical nitty-gritty that unified them.

As an analogy, PL semanticists think about fix-point operators, but programmers don't have to know about these: they just use recursion and have to understand convergence to a base case.

Wouter, if you think I still haven't covered some property of monads that makes them harder to learn or work with, I'm interested to know. I don't think we can make progress without figuring out what causes a feature to be more difficult.

that would be its non-transparent nature

If I am a C++ programmer and someone shows me the Haskell "do" syntax, I think, hey, that is easy, it is just like a statement list. Until I actually try and use it like that. I will create a monad, maybe a bit weird, but OK, following some examples, and I manage to create something. Now I write my do "statements". Something doesn't match up, and I get the weirdest type errors. For a C++ programmer, these are complete Chinese to me. To have an instant understanding of these type errors, I need to be so deep into functional languages to the extent that only people on this board are.

Yes I know, implementation vs design. But we've had Monads for a very long time now, and still nothing has changed. To me that says something that the problem is more fundamental. Monads could be made a language feature, but frankly that would make a lot of things more ugly in favor of some small improvements in usability.

What Monads appear to represent (simple sequencing), and how they actually work internally, are quite far apart. It is not wysiwyg at all. Good language features have a clear 1:1 relationship between syntax and semantics.

But we've had Monads for a ve

But we've had Monads for a very long time now, and still nothing has changed.
Is 10-15 years really a very long time? Given that there are only a few languages that use monads for sequencing, and that these languages are used by a minority of programmers, I'd say your remark is unfair.
[Monads are] not wysiwyg at all. Good language features have a clear 1:1 relationship between syntax and semantics.
I don't think there's any way of supporting this claim without resorting to "human factors" (no, don't go there). For example: how can you match first-class continuations, which introduce non-locality to the flow of control, to any syntax that is tree-based? Even the simpler lexical closures serve as an example. The scope is captured by a closure and used later, non-locally: syntax alone does not describe completely what closures mean. Try to justify syntax for lazy languages. Try to justify syntax for concurrent programs.

Syntax is fundamentally different from semantics. (And syntax issues are mostly irrelevant.) This "clear 1:1 relationship" is not clear at all beyond the simplest examples, and I think even these are based on past experience and not any objective measure.

You must walk before you can run...

If I am a C++ programmer and someone shows me the Haskell "do" syntax, I think, hey, that is easy, it is just like a statement list.

The one catch with this example is that you have chosen a programmer only familiar with a non-FP language, without requiring them to understand FP basics.

Monads are mainly motivated by an FP style, so if one hasn't grokked how a pure FP environment works and why you might want to use such a thing, it seems pretty unlikely you are going to be using a language with monads anyway.

So this wouldn't be a problem with monads, but a problem with a more basic idea upon which monads depend.

What if we didn't know it was a monad?

Wouter wrote:

A paradigm shift? That is too easy an excuse. Compare monads with another feature that requires a C++ programmer to rethink what they know, say laziness. Laziness is a clean, very transparent and simple construct that will have a smooth learning curve, even for a C++ programmer. Once you understand laziness it becomes so solid you don't need extra thought to use it. Not so monads: they are not transparent at all, not simple, and have a learning curve like Mount Everest.
At first I thought this was a joke, since what I usually hear from "C++ programmers" is that laziness is hard and unpredictable, and the sequencing is transparent and simple.

One of the most ubiquitous monads of all time serves as an existence proof that C++ programmers can understand monads. Consider the Unix command line. Most commands have the ability to take a filename as an argument and produce their output to stdout; restrict your attention to them. There is a command cat that produces the contents of a file on stdout, and an operator | that pipes the output from one command to another. | and cat obey the "monad laws" (Unix programmers know this and use it to optimize complex command pipelines):

  cat filename | cmd  =  cmd filename
           cmd | cat  =  cmd
(cmd0 | cmd1) | cmd2  =  cmd0 | (cmd1 /dev/stdin | cmd2)

puh-leeze

really, if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

Everyone says this about everything new, but high school students do calculus and 2-year-olds operate computers. New ideas start out hard until people figure out how to make them understandable. To quote Conor McBride, "the task of the academic is not to scale great intellectual mountains, but to flatten them."

Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Give it time. People have only just started trying to figure out how to explain this stuff. Phil Wadler solved the problem spectacularly for the research community, and now people are attempting to solve it for the wider community.

Remember way back when?

Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Well, I agree.

What I mean is, the way monads are treated in Haskell is not the best way to deal with state and effects. However, monads themselves are a very important concept, and should play a bigger role in future languages.

We already know that monads are not the best solution for handling effects. Let me quote from Park and Harper's A Logical View of Effects:

Our view is that monads are an abstraction that is more general than effects and that monads are a particular way to model effects. The first is based upon the fact that many datatypes that do not involve effects organize themselves into monads. For instance, it is easy to create a list monad by instantiating the type class Monad of Haskell. The second is based upon the observation that monads are not the only way to model effects. For instance, Nanevski [28] shows how to usefully model exceptions with comonads. Thus we are led to conclude that monads are not identified with effects and the study of effects is more fundamental than the study of monads. Plotkin and Power [36] present a similar view: "computational effects determine monads but are not identified with monads."

The second citation points to one of a series of papers by Plotkin and Power (and Hyland) which study effects algebraically; you can find the rest on Plotkin's page. (Disclaimer: I haven't read them yet.)

So every effect determines a monad, but not every monad determines an effect. Another (worse) way to say this is that monads are more general than effects. So if you approach monads from the perspective of effects, and try to understand what the monad laws have to do with effects, you are already going down the wrong path. This is one reason they may be hard to understand.

There are other reasons too. Let me try to suggest them.

First, monads in Haskell are a convention, not a language feature. There is syntax for them, and a class in the standard library, but the language does not guarantee that an instance of class Monad actually is a monad (i.e., satisfies the monad laws). In this respect they differ from datatypes; if you declare a datatype T, then you know that T actually satisfies the laws for an initial algebra. (Well, most of the time. :)
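
To make that concrete, here is a minimal sketch (the Tagged type is made up for illustration, with the Functor and Applicative instances modern GHC requires) of an instance the compiler accepts without complaint even though it breaks the identity laws:

newtype Tagged a = Tagged (Int, a) deriving Show

instance Functor Tagged where
  fmap f (Tagged (n, x)) = Tagged (n, f x)

instance Applicative Tagged where
  pure x = Tagged (0, x)
  Tagged (n, f) <*> Tagged (m, x) = Tagged (n + m, f x)

instance Monad Tagged where
  return = pure
  -- the extra "+ 1" type-checks but violates the monad laws:
  -- return 3 >>= return gives Tagged (1,3), not return 3 = Tagged (0,3)
  Tagged (n, x) >>= f = let Tagged (m, y) = f x
                        in Tagged (n + m + 1, y)

Nothing in the language rejects this; only the programmer's discipline (or equational reasoning) catches it.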

In my experience, programmers have more difficulty with program concepts that are only language properties and not language features. Because Haskell provides syntax, primitives and standard definitions which are intended to be used with monads, it gives the illusion that they are a language feature. But actually, programming with monads in Haskell is a bit like programming with closures in C: you can do it, by following a certain style, but C is not really the best language for it. (Don't read too much into this analogy, though.)

Second, I think there is too much emphasis in the literature on the notion of monads themselves versus specific monads. A monad is a very general thing, and not of much use in the abstract. Let me give you an example which will illustrate this. In the typed lambda-calculus, all of the major types and operations ((a ->), products, sums, datatypes) are given by adjoint functors, and every adjoint determines a monad. Yet they are conceptually very different, obviously. Don't get me wrong: the fact that they are adjoints is important, but not as important as the particulars of each adjoint and, hence, each monad.

Third, if you look at the categorical literature, you will see that the most important use of monads is for defining algebraic theories. In contrast, the most important use of monads in Haskell is for structuring effects. In fact, I think it was mostly unknown until Moggi's paper that monads had anything to do with computational effects. (A possible exception is the partiality monad.) Consequently, it is hard to apply the categorical literature to writing monadic programs.

I think this will change in the future, though I don't yet know what form it will take. Instead of using monads for effects, we will use them to define algebraic theories. In programming terms, this means "domain-specific languages". (Monads are used for this purpose now too, but mainly to deal with effectful languages; in the categorical literature, monads are used to encode the laws of a language: the reduction relation on terms.) In particular, monads have something to do with eval/quote, and you can see this already in MetaML. (They also have something to do with my isomorphisms stuff.)

So, having said that monads are not the best way to handle effects, I nevertheless believe that they will play an important role in the future, for other reasons. The syntax will be different. Probably it will be easier to understand, because the role they play will be more connected to essential features of all monads rather than to the particular class that has to do with effects.

The last thing I want to say is that I think this statement of Wouter's is a bit off the mark:

if even very seasoned programmers can't understand it, or have a very hard time wrapping their head around it, then it is clearly a badly designed language feature.

You do not have to understand a feature to use it. The truth is we use features we don't understand all the time. For example, how many Haskell programmers can actually state the equational laws that Haskell programs satisfy? Probably not many. How many Java programmers actually understand classes and interfaces? Hardly any. They understand the general idea, but could not give you a formal sketch of it. What about state or laziness? Many people understand them in outline, but do they actually know what rules are sound for reasoning about them?

State, especially, is one of those things which we all have a very shallow understanding of; you can explain it using a store, and fetch/put operations, etc. But there is a big difference between that explanation and "the Big Picture": it's possible to grasp a definition without understanding all its consequences. In fact, it's so common that we developed the notion of theorems and proof to deal with it. That few people "understand" state is evidenced by the fact that it still frequently bites us in the ass.

With monads, perhaps, it is the other way around: they're easy to use, but hard to define. Yes, I don't actually believe that it is so hard to use monads. Once you get the difference between a computation and a value, and understand that "there's no way to turn IO a into a", it's pretty easy, at least if you restrict yourself to the do-syntax.

Sure, people coming from C++ will have trouble if you just throw them into Haskell and say, "OK, now if you wanna do stuff like in C++, you should use monads." Of course they'll be confused! The differences between C++ and Haskell are much more fundamental, and you cannot expect to sweep them under the rug like that.

(Wouter also complains about type errors in using monads, but this is partly because of the way monads in Haskell are tied to type classes, and anyway dealing with type errors is something a Haskell programmer learns to do in many other contexts as well.)

What I mean about it being the "other way around" is this: compare it with state. It's easy to define state in terms of a store, but it's deceptively easy: you don't see the problems until they bite you because it all goes on behind the scenes. With stateful monads in Haskell, though, the problems tend to stare you in the face, and you are forced to deal with them, explicitly. I am not saying here that one approach is better than the other; what I'm saying is that people who are not used to having all the consequences of using state thrown in their face at once might find it daunting. I mean, when people talk about "dealing" with state what they usually mean is debugging. :) (Don't talk about objects; objects leak state like a sieve.)

Here is a (not very good) analogy. The difference between explicit monadic state and implicit imperative state is like the difference between breadth-first and depth-first search. With BFS, you see all the options at once, but there might be so many that it is hard to cope; with DFS, you are blind and have to do a lot of backtracking, but it's less daunting because you see so little at once. (I know it's hopeless to say this, but: you should not read too much into this analogy.)

Better example

For example, how many Haskell programmers can actually state the equational laws that Haskell programs satisfy?

This was not a good example. A better example is things like POSIX primitives. Most people don't really understand buffering, or sockets, or whatever. They experiment with them, and eventually get a sense of how they work. Occasionally they make errors, and (if they figure out a workaround) their understanding of the system deepens. But ask them, "What is file I/O?" and I don't think they could give you anything but the most general answer. Similarly, answering "What is a monad?" is probably not going to be useful information for most programmers.

Of course, I think that it should be useful information, and programmers should know how to exploit it. And I think the lack of a formal specification for POSIX sucks.

My point is rather that you are holding monads to a much higher standard than you would hold POSIX. You want an answer to a mathematical question, but you would not demand a mathematical specification of POSIX I/O. (I would, but that's a different story. :)

We already know that monads

We already know that monads are not the best solution for handling effects. Let me quote from Park and Harper's A Logical View of Effects:

Thanks for the link! I can't wait to read this.

First, monads in Haskell are a convention, not a language feature. There is syntax for them, and a class in the standard library, but the language does not guarantee that an instance of class Monad actually is a monad (i.e., satisfies the monad laws).

I agree. I think another stumbling block is that it's hard to give a concrete answer in the context of Haskell of what the monad actually is. First of all, it's hard to think of a type constructor and two operations as a thing, and it sure isn't intuitive for a newcomer. Second of all, since the monad is really a semantic object, it's actually pretty subtle to distinguish the mathematical entity from the type class, the instance, and the implementation code. I agree that people can use technology without a deep understanding of it (who was it who said that famous quote about formal methods?), but I kind of think when Simon makes his joke about "warm fuzzy things" that it's not entirely a joke: maybe they should have called it a "design pattern" or something, because when someone is trying to grok Haskell and wants to know "what is a monad?" it's hard to give them a one-sentence summary that says, "a monad is X."

Instead of using monads for effects, we will use them to define algebraic theories. In programming terms, this means "domain-specific languages".

I don't know enough of the theory yet to really know what you mean, but I think I like this idea. My reaction to e.g. the State monad in Haskell has been to wonder why people are so excited about writing an interpreter for a useful language feature every time they want to use it. If the feature belongs in the language, it should just be in the language, right? Go ahead and use monads in the semantics, but why keep reinventing the wheel? But maybe DSL's are the meeting ground: leave the feature out of the general purpose language to prevent people from overusing it, and use monads to build up a DSL with nice algebraic properties when you need the feature often enough for a particular problem domain. Still leaves a lot to taste, but there might be a story to tell there.

This is exactly what I do

I kind of think when Simon makes his joke about "warm fuzzy things" that it's not entirely a joke: maybe they should have called it a "design pattern" or something[...]

That's exactly what I do these days. A design pattern really is just a moderately tricky algebraic structure, just like a monad. It's just that programmers think that they can understand a design pattern, whereas for some inexplicable reason they tend to get confused if you say "a monad is a composition of adjoint functors". ;)

What are the superior alternatives to monads?

The papers on Lawvere theories and modal logic are fascinating, but I don't understand them well enough to compare them to monads. Would it be possible to outline how these methods differ and why some are better than others? How might these differences manifest themselves in actual programming languages? Is it possible to use these techniques in current languages, e.g. Haskell?

Where do John Hughes's arrows fit into this?

Must Agree With Wouter

Despite the protestations of others I am thoroughly convinced that Wouter is correct. So sad that here we once again have conflicting posts attempting to explain what apparently should be simple to explain, but clearly failing, along with promises that monads are the "next big thing".

Monads are of interest primarily to academics and will remain so. Perhaps powerful but of limited use; like quaternions they are self-limiting in that they are not sufficiently approachable by the great unwashed masses of programmers (nor even by the top 5% of that group, IMO). And like quaternions, there is likely a more acceptable structure to be found that either subsumes them or will eventually supplant them.

To paraphrase Wouter, give it a break.

Don't be so sure. F#

Don't be so sure. F# computation expressions are bringing monads to the masses. No doubt they are still an advanced feature, but they will be accessible to a much wider audience than they were before.

Monads to the people

Along the same lines you have .Net LINQ which is a monad comprehension plus stuff for ordering and grouping plus the expression tree stuff, and then you have Scala's for-comprehension which is a monad comprehension plus an imperative foreach.

Even without monad comprehensions, I've seen monadic parser combinator libraries for Java and Ruby.

Speaking of monadic parser

Speaking of monadic parser combinators, FParsec deserves a mention here as a good example of computation expressions being used to good effect in F#.

Thanks!

Give up. Go find a more intuitive solution for dealing with state. Yeah I know it is hard, sorry.

Thank you for explaining this to all of us. We are sometimes overly inclined to treat programming as an advancing field.

I think we can also apply your argument to things like the actor model, generic datatypes, locking and even type inference. These are the kinds of things that give even seasoned programmers serious trouble; if we could just get rid of them and the application areas served by them, we would all be better off.

Get serious.

We all know that real programmers stick to Peano arithmetic. Integers are for sissies. Talk about an entitlement mentality! :-))

A trade-off

There is trade-off between the complexity of a feature and the advantage it brings for programming.

Researchers don't care about this complexity (and I would even add that they like it, as it may be an opportunity for research) but programmers do care about it: too much complexity to understand or use for too little advantage? The feature won't ever become widespread.

I consider that monads and locking are in this category, but not the actor model or generic datatypes.
Type inference is fuzzier: are you talking about local type inference or global type inference? Those are quite different.

There's a spectrum to such things, as you say?

iiuc, while the actor model is perhaps likely to result in less broken concurrent systems than shared-mutable-state locking, it doesn't e.g. guarantee no deadlock. also, iiuc, seems like stm is trying to claim to go one step further wrt deadlocks? other alternatives like lock-free seem to do away with one boogeyman (deadlock) only to run the serious risk of inviting another one across the threshold (livelock)?

point being that i consider actors to still require attention and real thought to get right; one man's not complicated may be another man's complicated. so then you have to ask who is the desired audience in any case. e.g. maybe monads simply aren't meant for Joe Java.

Monads in Perl

FWIW, I wrote a little tutorial that attempts to explain monads by implementing them in perl. Things became much clearer to me once I had divorced the idea of monads from their concrete representation in Haskell. Heck, once I figured monads out, I even dabbled with monadic parser combinators in perl.

yamt

or, yet another monad tutorial... :)

just about the IO monad in Haskell, but in a concise writing style refreshingly distant from the likes of "A Gentle Introduction to Haskell"...

i found it to be pretty instructive:

http://haskell.org/haskellwiki/IO_inside

in particular, this little section:

http://haskell.org/haskellwiki/IO_inside#What_is_a_monad.3F

Do Haskell compilers treat the IO monad specially?

Do Haskell compilers treat the IO monad in a special way? If I write my own monad, would it be optimized in the same manner the IO monad is optimized? Is it a requirement for a Haskell compiler to treat the IO monad in a special way?

It's special in the sense

It's special in the sense that it can't be implemented in Haskell, but there's no requirement beyond that. OTOH, it's likely to come in for at least some optimisation, and the implementation is effectively 'magic'.

Nice tutorial.

It's impressive how monads sequence computations and turn a program into imperative style... functional programs start with referential transparency and slowly become imperative, whereas imperative programs are turned functional by compilers doing SSA analysis.

In the end, it seems both approaches end up with the same result!

Not quite true - monads

Not quite true - monads potentially allow more to be specified in the source code, it's easier to insist the presence of an effect be specified than its absence be proven. This isn't yet being used by Haskell compilers as much as would be nice, although when you do the analysis a surprising amount of 'monadic' code is pure scaffolding building things up and the compilers certainly do take advantage of the properties of pure functions.

You could have invented monads!

Sigfpe's superb monad tutorial was mentioned previously, but since this thread is linked from the Getting Started thread, let's mention it here as well:

You Could Have Invented Monads! (And Maybe You Already Have.)

Instead of invoking the fearsome specter of Category Theory and then hand-waving it away, sigfpe describes how monads arise naturally as straightforward solutions to certain types of problems. He even lets you derive them yourself, provided you don't scroll down too quickly.

Sorry for resurrecting this

Sorry for resurrecting this thread...

Philip Wadler has quite a number of papers on monads. Which one(s) should I start with?

I like this one, but

I like this one, but opinions may vary...

I find that one more

I find that one more enjoyable than the very similar Monads for Functional Programming, so I recommend it as well.

Also, to add, it is completely acceptable even encouraged to "resurrect" old threads.

Thanks Ehud and Derek.

Thanks Ehud and Derek.

intuition for imperatively trained programmers

I like this one. It's in the same spirit as the "You Could Have Invented..." link above but is lower level:

In the monad way of doing things, instead of calling a magic function to perform output, you return values. Instead of calling a magic function to perform input, you return functions (almost callback-style) that can later be invoked with the new input. That's about all there is to it.

And, oh yeah, in this model: RAM is also treated like I/O. Programs don't directly touch RAM -- they output things to be stored and read back in things previously stored.
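
A minimal Haskell sketch of that callback-style picture (the Interaction type and run driver here are invented for illustration; this is not how GHC's IO is actually implemented):

-- the pure program never performs I/O; it returns a description of the
-- next output, or a function awaiting the next input
data Interaction
  = Done
  | Output String Interaction
  | Input (String -> Interaction)

echoOnce :: Interaction
echoOnce = Input (\line -> Output line Done)

-- a tiny impure driver, living "outside" the pure program, walks the
-- description and performs the actual effects
run :: Interaction -> IO ()
run Done         = return ()
run (Output s k) = putStrLn s >> run k
run (Input f)    = getLine >>= run . f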

Then, when the imperatively trained programmer comes around to "But wait, isn't that incredibly awkward?" you can start talking about closures and continuations and other things that make it incredibly easy (and clean).

-t

Yet another tutorial.

James Iry has been writing an introduction to monads as series of articles in his blog. It is in Scala, something that may appeal to programmers jumping into the FP world coming from Java. I'm finding it quite clearly explained and well written.

Part 1
Part 2
Part 3
Part 4

meaningOfLife :: IO ()

Reading the comments here, I've learned that I'm on the dumber side of the smart people group. I got a book Type and Learn C when I was 13--my very first book on programming--which I finally made it through in a couple of months. Almost one of which was spent going through chapter 7 or 8 (can't remember) on POINTERS. Thinking aloud while contemplating the meaning of such a complicated idea: "So a value is stored IN a variable which is AT a memory location that, if dereferenced, then uses THAT value to retrieve the VALUE stored at the memory address equal to the FORMER value." But for the most part, it was very easy. X = 3 means x will == 3 until it gets changed again. Just like how things work in real life.

The reason monads (IO monads) are so difficult for me in Haskell is because they don't work at all how they're "supposed to". I'm going through Haskell.org's monad tutorial and can't figure out a way to get the supposed while loop to work. while-anything other than (return False) appears to be true. I tried x <- getInt and return (x < 3) as the logic to break the loop, the idea being that a user types in 2 and it returns False :: IO Bool. Either it ran forever or a slightly different attempt at the coding turned into some ridiculous error more or less equal to: Instance of (Num [a] => Ord (char (((a Int => [[a]] -> wtf) Bool) IO () ). Then I said "Of course!" and promptly fixed the error, and was well on my way to parsing combinatoric metaparadigms using advanced hyperbolic topology.
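
For reference, here is a minimal sketch of the kind of loop being attempted (while and getInt are written out here rather than taken from the tutorial): the crucial point is that the condition is an IO Bool action re-run on every iteration, not a plain Bool computed once.

getInt :: IO Int
getInt = readLn

-- repeat as long as the condition action computes True
while :: IO Bool -> IO ()
while cond = do
  b <- cond
  if b then while cond else return ()

main :: IO ()
main = while loopCondition
  where
    loopCondition = do
      x <- getInt
      return (x < 3)   -- True keeps looping; typing 3 or more stops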

Doing things in Haskell is easier in the sense that, once you figure out how you should've done it in Haskell, the problem is simple. Once you figure out that the function you are computing is "just" the uncurrying of the \x -> mapping of the function that folds the reversing of the IO Int + x list parsed into sublines of length 4 or list!!3, whichever's greater, you can then write one pretty line of code with 20 characters that has a runtime of 4x the length of time it takes you to write it in asm and execute it 305 times as you wait for the Haskell copy of yourself to play catch-up because he was too busy admiring how quicksort looks as he procrastinated coming up with an actual solution to the current problem. If it were really that easy to reason about and program super high-order stuff I would just write a Haskell program me = god. Except that implementing god is very hard indeed. Besides you can write int me() { return god(); } in C++ anyway for the same effect. Yes, god returns an int. Though some religions think it's a char. That's why we're at war with them.

Hmmm

Sorry to hear that you're not taking to Haskell... Rest assured that what you're going through is normal, and that it is possible to get past this phase, although yes, it does take time and effort. Whether it's worth it to you isn't mine to say. Anyway, this is getting pretty off topic for LtU.

The Trivial Monad

I had read all the usual references, but the light didn't really come on for me until I read sigfpe's mercifully short article on The Trivial Monad.

For a non academic developer

It has indeed been a long journey to find thy holy Monad, and yet I am still on my way.

The best starting point seems to be The Marvels of Monads.

Monads are about more than just sequencing. As I have understood it:

I want to do something to an object of a specific type (in a generalized way). To do so, I should first amplify it with an enhanced type (a constructed type). The Monad design pattern lets me work with this enhanced type (which is a monadic type itself). (As said in "The Marvels of Monads") In fact, the Monad design pattern provides a mechanism for composing computations (think of functions) on the enhanced type (amplified values).

We can do ordinary function composition given an identity function and the left-identity, right-identity and associativity laws (for functions with types like f : a -> a).

By employing Monads we can do composition on enhanced types. Now if we look into Monads we can see equivalents of identity, associativity and composition. Identity corresponds to unit, composition corresponds to bind, and the bind operation gives us associativity.

Now we want to put it to work. We can use a Monad for piling up code (sequencing, or simulating side effects). We can think of changing state as a series of functions that, in a chain of calls, give us a changing value. And we can use monads even for making parallel computations (as in PLINQ).

The magic point of a Monad is in fact its "bind" computation. In bind you can decide how the computation should take place. You can pile it up and simulate a side effect, or decide to execute computations in turn, or even distribute them among the accessible cores on your machine. And you can even decompose the computation's inner structure to interpret its meaning, like LINQ to SQL, which translates our lambda expressions into SQL expressions when binding them. It is good to remember that the "meaning" of a Monad (what it is meant to be) lies in the implementation of its "bind" method.

(In unit you cannot do magic things, because that would break the monad laws.)
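
A minimal Haskell sketch of this "amplified value" idea, using Maybe as the enhanced type (safeDiv is just a made-up helper):

-- Maybe a is the amplified type; Just plays the role of unit, and (>>=)
-- decides how computations on amplified values compose: here it
-- short-circuits the rest of the chain as soon as a Nothing appears
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

example :: Maybe Int
example =
  Just 10     >>= \a ->
  safeDiv a 2 >>= \b ->
  safeDiv b 0          -- Nothing here makes the whole result Nothing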

These are some of my own tries to understand Monads: Where is the magic point of a Monad? and Monad Described Imperatively.

My own question: I do not understand how we can talk to Monads when we are inside a nested Monad. How can I know which Monad I am in? How can I speak to some upper (?) Monad?

Note: The first title was "For a none academic developer"; apologies for mistakes in my English writing.

Notes from another none academic developer

As one who has not had an epiphany on monads up to this point, I offer a couple of random thoughts, with the thought that it might help the teachers understand the predicament (from one who is quite willing to put his ignorance on public display):

A couple of possibilities for my current state of non-enlightenment:

  • I'm not intellectually proficient enough to come to terms with such abstract things;
  • I have a mental block - I have to reconcile monads with my formative use of Pascal and Assembly;
  • I'm not trained enough with the necessary background - you can only learn what you almost know;
  • I might understand monads in its rudimentary form but don't have the feel for how it all fits together;
  • I don't have the time to get the concept into my head;
  • I've not had the religious experience - there are those that understand and those that don't - with no middle ground.

Any or all of these are distinct possibilities.

That said, my limited feel for monads is that they are very much about sequencing, so I don't know that it is helpful to get into the complications of commutative monads - especially since Haskell has no solution for defining that property for its monads. It also probably doesn't help to complicate the notion of sequence by referring to things that we don't normally associate with the abstract idea of sequencing. Of course, we could adopt terminology that is probably more correct - like Dave does with Linearize above. But in my process of trying to grapple with Monads, running with the more overloaded metaphor of sequencing is probably more optimal - but since I'm on the learning side, rather than the teaching side, my opinion may be wrong.

In investigating monads in the past, I've come to the conclusion that when people say they can't understand monads, they are generally approaching them from one of several different angles:

  1. What are monads in the mathematical sense?
  2. What are monads good for?
  3. What are the mechanics to defining and using monads in Haskell?
  4. Does Haskell make monads easy to work with?

From your post, I gather you are primarily addressing the second question - which might be construed as whetting the appetite so that one can proceed to either question 3 or question 1. Most naysayers would point to question 1 and say that if using monads requires one to understand such abstract concepts, then perhaps it is a futile attempt to introduce monads to any but the elite. My reading on question 1 is that Category Theory in general and Monads in particular are based on dirt simple axioms - the disconnect is that I want to leave question 1 as soon as possible, but I can't make the connection between simple postulates and the practical application.

Actually I think question 4 is the most unanswered question of all, and probably the one that is most pertinent to LtU. I'd conjecture that since many of the explanations of monads are tied to evangelism of FP and Haskell, they don't care to dwell on the limitations of the language they are using to demonstrate their understanding. From my limited knowledge, I understand that monads are more a design pattern within Haskell than they are a language feature. And as Peter Norvig might say (and many FP advocates echo), design patterns point to limitations of a programming language. Programming languages are implicitly designed to be tools that teach us how to think. So the question is whether Haskell can truly convey an understanding of how to go about using monads? Attempting to use monads in most any other language is torturous at best, so Haskell is the hands-on medium of choice. But it is probably best to understand that it is not perfect.

Agreed

Agreed with most of your points. I chose not to define Monad because I thought the provided link points to a good explanation.

- "Attempting to use monads in most any other language is torturous at best"

I do not agree. For example, consider LINQ in C#; it is pretty straightforward. And I think in every dynamic language - or every language with good support for duck typing - we can employ Monads seamlessly.

And I did not have Haskell Monads in mind because, to be honest, I am not fluent with them yet.

- "From your post, I gather you are primarily addressing the second question"

I do not know how successful I was. But now I know not to hesitate to say "AHA!" in front of Monads. Actually I think of Monads as if they were a toolbox of:
1 - A tag or a structure that we want to add to our values
2 - A mechanism/function for making that
3 - A mechanism/function for applying our... functions
This toolbox lets us control the mechanism of applying things to the tagged/constructed value; and this ability to control gives us the magic.

I do not know if it is enough but I know I want simple definitions.

Language embedding perspective

Actually I think of Monads as if they were a toolbox of:

1 - A tag or a structure that we want to add to our values

2 - A mechanism/function for making that

3 - A mechanism/function for applying our... functions

This toolbox lets us control the mechanism of applying things to the tagged/constructed value; and this ability to control gives us the magic.

That's a good start. In the LtU context, given some basic understanding of subjects such as lambda calculus, continuations and CPS, and language semantics, it's not that difficult to go a step further and reach an understanding of the general nature of the "magic" of monads. The following is a sketch of this.

Monads are often described as modeling computations. Well, you can't really model a computation without a language of some kind. Monads model computations by allowing a specialized language to be embedded in the host language. (In current net meme terms, "Yo dawg, we heard you like computation, so we put a language in your language so you can compute while you compute." Ahem. I've clearly been reading sites other than LtU.)

Monads model languages by allowing you to define two critical features of a programming language: the type of values in the language (point 1 in the quote above), and the function application operation for those values (point 3 above). Point 2 is essentially a generic interface to the type's constructor, in the form of the return function.

The lambda calculus demonstrates that a typical useful language can be modeled using four core components:

  • the values that the language supports
  • function abstraction (e.g. lambda), to define variables
  • function application, to bind variables to values
  • a way to reference the values of variables

A monad defines two of these for itself — values and function application — and the other two operations, function abstraction and variable reference, are handled by the built-in features of the host language. Together, these features support essentially arbitrary embedded languages.

This is taken a step further by the pattern embodied in Haskell's do notation, which is syntactic sugar for a chain of monadic applications in which monadic values are passed (via the monadic application operator) not to an ordinary function, but to a continuation. The continuations here are Haskell lambdas which contain "the rest of the computation", i.e. the remainder of the chain of monadic applications. "The computation" here corresponds to the do block. Do blocks are thus written in a monadic continuation passing style, which makes it possible for monads to manipulate continuations, if desired, adding to the range of capabilities of monads.
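
A small sketch of that desugaring: each lambda introduced by (>>=) is exactly the "rest of the computation" the do block describes.

greet :: IO ()
greet = do
  name <- getLine
  putStrLn ("Hello, " ++ name)

-- roughly what the do block means: a chain of binds, where each lambda
-- is the continuation holding the remainder of the block
greet' :: IO ()
greet' =
  getLine >>= \name ->
  putStrLn ("Hello, " ++ name)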

This brings us to the connection with PL semantics. Monads were originally applied to programming languages by Moggi, in an attempt to modularize definitions in denotational semantics. If you look at a non-monadic denotational semantics definition, such as the Scheme denotational semantics in R5RS, you see global features such as the environment and store being threaded throughout the definition as function arguments. This makes for rather complex and non-modular definitions. Monads allow these pervasive parameters to be abstracted out, greatly simplifying the pure functional definitions of programming languages.

This allows a programming language to be defined by mapping terms in some source language quite straightforwardly to denotations in a pure functional target language. Essentially, when you write a do block in Haskell, you're acting as a compiler for some source language and emitting denotations for that language in Haskell.

In many cases, this is not particularly onerous, although if you're really embedding a fully general-purpose language this way, using a special-purpose evaluator starts to become attractive, to allow you to program directly in the source language.

Learning the basic uses of monads doesn't necessarily provide much insight into this language-embedding aspect of their nature. Arguably, such insight isn't necessary to use monads, but this generality - the fact that they're essentially a language embedding mechanism - helps explain why people often feel they haven't grasped monads even after they're capable of using them.

Reflections

Thanks for your descriptions; yet I do not like do-notation in Haskell. I do not know why (maybe just because I do not know Haskell fluently).

From what you said I made some reflections:
So if I expand the 3rd part of my toolbox (a mechanism/function for applying our... functions, which lets us control the mechanism of applying things to the tagged/constructed value), I can say (in a generalized way):

Monad (IMO, in fact the implementation of the 3rd part of the toolbox, the "bind") lets us inject an interpreter into our language!

This is what LINQ does, isn't it?

Do notation is closely

Do notation is closely related to ANF, an intermediate representation for compilers. That might explain an awful lot of your distaste?

And all (Haskell) monads can be expressed in a form involving an interpreter, yeah. In fact, my talk at AngloHaskell this year was on mechanically going from that form to the more normal one.

Abstractions

I was tinkering with Arrows and suddenly asked myself: "These things have no relation to my daily job! What the hell am I doing?"; and I heard myself answer: "I am abstracting!"

I think that would be a good help for everyone who tries to understand Monads. Just keep in mind we are abstracting things. But what? Here is where I have ended up (have in mind things like functions, application, composition, ...):

One can think of Monad as “abstracting the (function) application (process/operation)”.

One can think of Arrow as “abstracting the (function) composition (process/operation)”.

One can think of X as an abstraction of Y (Association? Even identity itself? Environment? ... I think we end up with virtual machines or virtualizing again!)

Abstracting application may be something like this: we have some applicable things and some values, and we can apply the applicable things to values in a context. What we actually have in the Monad definition and the Monad laws is an implementation of this concept; we need some environment (the Monad), we need a tool to push values into it (return/unit), and we can apply an applicable thing like f to them with bind.

In fact, while I do not have a full grasp of Arrows yet (maybe of Monads either!), when I first read about them I felt one could easily mistake an Arrow for a Monad. For example, it seems that what is needed to make concurrent computations is not a Monad but an Arrow! By abstracting the composition process/operator, we have control over where each computation (in a chain of compositions) occurs. One can think of an application like f x as a composition like (f o id) x; and this chain/tree of compositions (function calls) can be executed concurrently.

How do arrows fit into your view of Monads as interpreter injectors (which was quite an insight for me)?

Monads describe higher-order

Monads describe higher-order applicative languages, arrows describe a particular flavour of compositional language. In particular, arrows frequently describe a first-order language - that's an important distinction because there are things you can do with a first-order language that you can't with a higher-order one. That's how self-optimising combinators get enough purchase to work, for example.

You might find applicative functors worth toying with as well. They describe a weaker flavour of applicative language that may well not have higher-order computations.

The return of chaos

As a total aside, my own personal stumbling block has less to do with monads than it does with recovering the ability to lapse back into chaos. I want to be able to do what I am currently doing in non-pure languages. The purists would tell me that I shouldn't be constructing chaotic programs in the first place. And then they tantalize me that, if I really need to, I might be able to use monads to bring order to the process.

A trivial (if arbitrary) example of chaos in Alice ML (injecting such behavior as random numbers, state variables, concurrency, I/O, etc.) would be something like:

(* concurrent chaos - aka reasoning about state/encapsulation *)
local
   val object = ref 0
in
   fun chaos n =
      let
         val delay = Random.int 5000
      in
         spawn ( Thread.sleep(Time.fromMilliseconds (Int.toLarge delay));
                 object := !object + 1;
                 print("Sequence=" ^ Int.toString n ^ " Evaluated=" 
                    ^ Int.toString (!object) ^ "\n") );
         delay
      end
end
val x as (_, _, _, _, _) = (chaos 1, chaos 2, chaos 3, chaos 4, chaos 5)

which might randomly produce results something along the lines of:

val chaos : int -> int = _fn
val x : int * int * int * int * int = (1841, 1204, 1167, 4816, 1556)
Sequence=3 Evaluated=1
Sequence=2 Evaluated=2
Sequence=5 Evaluated=3
Sequence=1 Evaluated=4
Sequence=4 Evaluated=5

When I say I don't comprehend monads, it means (a) I don't know if monads can model all the chaos here; and (b) for those pieces that it can model, the code doesn't appear to be what I consider to be straightforward.

Chaos is a monad

(The existence of random monads, probability monads, and Monte Carlo monads makes the subject line quite legitimate, and happily they really "are" monads, and don't have to be shoehorned to fit, so marco couldn't complain, even if he hadn't taken a vow of silence on the matter.)

(a) I don't know if monads can model all the chaos here; and (b) for those pieces that it can model, the code doesn't appear to be what I consider to be straightforward.

(a) monads can easily model this, although some might complain that they currently do so in Haskell by shoving all the necessary effects into the IO monad;
(b) the code is reasonably straightforward, with some caveats that apply to monadic code, discussed further below.

A tutorial here is a bit out of scope, so I'll just give an example, written for readability by non-Haskell wizards:

import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newMVar, takeMVar, putMVar)
import System.Random (randomRIO)

chaos object n = forkIO $ do
  delay <- randomRIO (0,5000000)
  threadDelay delay
  i <- takeMVar object
  putStrLn $ "Sequence=" ++ show n ++ " Evaluated=" ++ show (i+1)
  putMVar object (i+1)

x = do object <- newMVar 0
       mapM (chaos object) [1..5]

The code is straightforward enough, given a basic understanding of do notation and monads — the kind of thing that good monad tutorials cover — as well as some familiarity with the library functions being used.

Save that code to a file chaos.hs, run "ghci chaos", and enter "x" at the prompt. The output should be something like this (I ran it under GHC 6.8.3):

*Main> x
[ThreadId 29,ThreadId 30,ThreadId 31,ThreadId 32,ThreadId 33]
*Main> Sequence=3 Evaluated=1
Sequence=5 Evaluated=2
Sequence=2 Evaluated=3
Sequence=1 Evaluated=4
Sequence=4 Evaluated=5

Different results will be printed each time you evaluate x, making for more apparent chaos than an ML version! ;)

I've cheated a little here - this won't work when compiled, because the main thread won't wait for the child threads to terminate. This is discussed at the bottom of the doc page for the Control.Concurrent
library, which gives examples of how to write a waitForChildren function to address this. This doesn't have anything to do with monads or Haskell in general, though, except perhaps in the sense that the core libraries are sometimes deliberately fairly low-level, for reasons of generality.

Getting back to the real topic - there's no question that having to use monads here adds a layer of complexity that has to be learned and dealt with. That can be seen as a tradeoff one makes for purity, at least right now.

I have little doubt that future pure-ish functional languages will provide friendlier approaches. Arrows seem like one useful approach, and Erlang offers another sort of model. But each of these approaches has its pros and cons. I doubt that any single one of these will be the final answer, which means there's value in understanding and knowing how to use monads for functional programming right now. I know that some professional OCaml programmers agree, so this goes beyond pure languages.

Speaking of purity, what are the semantics of object := !object + 1; in the Alice ML code? Might that not produce a race condition in which two threads both read and then increment the same object value, or even a thread overwriting a bigger value with a smaller one (which becomes more likely on multicore machines)? Along similar lines, how much chaos do you want here - what prevents the individual characters of output from different threads from being interleaved, other than chance and the relatively long delay value?

The use of MVars in the Haskell version guards against both of these issues, and while that can't entirely be attributed to purity or monads, purity definitely helps to keep you on the straight and narrow in such cases. Having a core language feature that allows a mutating x=x+1 to be used without restraint, which can produce subtle race conditions, is not all that dissimilar from having unrestricted memory pointers - possibly useful in a low-level language, but is it really a good idea in a high-level one? It's a good way to lull programmers into a false sense of complacency, at least!

There's no way Arrows can be

There's no way Arrows can be "friendlier" than monads. Every monad induces an arrow. If Arrows were easier to use then one could just use the Kleisli arrow, but no one actually does that.

One way that would actually be "friendlier" would be an effect typing system or something along those lines.

Finally, the monadic Haskell version* seems at least comparable to the ML version (arguably, even superficially more familiar to the stereotypical imperative programmer.) It certainly doesn't clearly illustrate monads adding a "layer of complexity."

On a different tack, all of these effects could be modelled with monads (and in Haskell) obviously. This certainly would be more complex, but that's just the complexity of the effects and their number and their mixing. This would still be only like a page of code but not the kind of code most beginners would be able to write unless they had a fairly good prior understanding of denotational semantics. Not using monads, though, would not make it any less complex. In my opinion (supported by some experience as well), it is not actually monads that beginners have trouble with but modelling the effects. Most programmers can not readily provide a denotational semantics for state, say. If I ask someone to implement a State monad and they can't, it is almost certain that they can't model state denotationally at all. If I ask someone to model state denotationally and they can, it is very unlikely that they would have any trouble then making a State monad.
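
For concreteness, a minimal sketch of the State monad in question, i.e. the usual denotational model of state as a function from an initial store to a (result, new store) pair (written out by hand rather than imported from Control.Monad.State):

newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State mf <*> State ma = State $ \s ->
    let (f, s')  = mf s
        (a, s'') = ma s'
    in (f a, s'')

instance Monad (State s) where
  return = pure
  State m >>= k = State $ \s ->
    let (a, s') = m s
    in runState (k a) s'

-- the two primitive effects: read the store, replace the store
get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

If modelling state as a store-passing function already makes sense, the instance above is just that model packaged behind return and (>>=).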

*Incidentally, your code looks like what I'd expect anyone to write in Haskell. The only things I might do differently is use forM instead of mapM and maybe write threadDelay =<< randomRIO (0, 5000000)

There's no way Arrows can

There's no way Arrows can be "friendlier" than monads. Every monad induces an arrow.

I was thinking of a couple of things here: first, the relationship between functions and arrows seems a bit more direct, which can make the essential nature of arrows a bit easier to grasp initially. Second, I was thinking of cases where arrows make more sense than a monad, with the vague idea that in future, handling of effects in Haskell could become less monad-centric if other approaches were integrated as tightly as monads have been with do notation and the IO monad.

[The monadic Haskell version] certainly doesn't clearly illustrate monads adding a "layer of complexity."

I agree that superficially, reading the code, it doesn't. However, writing the code, particularly for a beginner, requires a level of understanding of monadic programming that I don't think has a direct parallel in a more imperative context — such as the ML code, which I'm assuming is imperative to the point of bugginess, with its object := !object + 1. So I believe that the imperative style is typically easier to learn initially, but ultimately more error-prone and less tractable, particularly in the concurrent case.

The only things I might do differently is use forM instead of mapM

I figured non-Haskell functional programmers might guess the semantics of mapM more precisely than forM. (Don't believe any rumors that claim I used mapM because it's in the Prelude, to keep the line count down.)

and maybe write threadDelay =<< randomRIO (0, 5000000)

I'm sure it's obvious, but that was to reduce the number of different operators being used. It also illustrates that the code can be written naively, without even knowing that the reverse bind operator exists.

I agree with you here

Aside from the theoretical perspective, I really think that the future for FP is something close to arrows, where an arrow should be a functional block/module with local state which is instantiated once.

FP and OOP

"Aside from the theoretical perspective" - since I am not so good at it - to what extend we can comprehend functions in JavaScript as arrows as you defined here?

I have a reflection on using OOP in an upside-down fashion here:
- objects as the state of their methods (something - aside from the theoretical perspective - like a shared closure)

It may seem odd at first glance, yet I think it is the source of the expressiveness of JavaScript. Io is another programming language with this "sense" of expressiveness. These languages do not follow the mathematical direction the way something like Haskell does (I think), yet they represent these concepts in another, pragmatic way of describing things.

Monads are Inversion of Control au naturel

I have thought of Monads as a higher-order version of IoC (for just separating/abstracting the execution path of computations), which gives us the opportunity to hook into the language itself. I have found a more general and interesting post on this topic here, which corrects me with a more generalized point of view.

In my opinion (supported by

In my opinion (supported by some experience as well), it is not actually monads that beginners have trouble with but modelling the effects. Most programmers can not readily provide a denotational semantics for state, say.

I think that's it exactly. Explicitly having to model the state machine they implicitly build in an imperative language is the chasm most people have trouble crossing. I'm starting to believe it's worth it though.

Everything is a monad

(The existence of random monads, probability monads, and Monte Carlo monads make the subject line quite legitimate, and happily they really "are" monads, and don't have to be shoehorned to fit, so marco couldn't complain, even if he hadn't taken a vow of silence on the matter.)

I'll break my vow, a bit, without complaining, I hope. Another observation: everything, or almost every ADT, has a monadic implementation.

For example, I can model a graph by building a graph monad with an initial empty graph state, and then update the graph with my own actions.

Stated differently, you can, but you don't need to, give an imperative implementation for all constructs you can think of in a functional language. But that kind of defeats the intention of FP, doesn't it?

So, monads for dealing with side effects, yeah well, I can dig that to some extent. (Note that that even only works since Haskell has lazy semantics. I.e. I cannot (I guess) evaluate two IO monads.) [Actually, this is wrong, a value in IO is only evaluated when supplied to the top level.] However, monads for modelling ADTs in a functional language... Nah.

(Anyway, I guess that dealing with IO can always be done by using an abstract type and update functions on that type. Which, in my view, makes all the monadic fuss a tad bit overdone.)

[whatever]

Break my vow of silence again

Anyway, my biggest problem with monads is not that they're hard to explain. Or even that I don't like monads from a theoretical view (the minimal model for the IO monad is actually a model which _doesn't_ have a state; this is naturally true since one cannot observe anything which goes into the IO monad. I find that a bit troubling. [I assume the IO monad is one of the few monads which only abides by the 3 monad laws, hence you cannot look into it.])

My biggest thing with monads is that I don't see them scale.

For example: I wrote a compiler. The whole enchilada is pure and strict, with one exception: I use a counter to generate unique names and labels.

Now, the same program built purely would have that counter in a monad, or another solution, interwoven through almost the whole program.
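
For illustration only (not marco's compiler), a minimal sketch of what such a counter looks like when it is threaded purely, using the State monad from the mtl package:

import Control.Monad.State (State, get, put, evalState)

type Fresh a = State Int a

-- every function that needs fresh names now lives in Fresh, which is
-- exactly the "weaving the monad through the program" being described
freshName :: String -> Fresh String
freshName prefix = do
  n <- get
  put (n + 1)
  return (prefix ++ show n)

-- evalState (mapM freshName ["x", "x", "y"]) 0  ==  ["x0", "x1", "y2"]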

To me, it would be annoying if after a month of coding I would discover that I would need to weave a monad through a substantial part of my code, just to do some trivial side-effecting code somewhere in a corner of my program.

Now one could solve that problem by, say, tagging functions where one would have a monad passed implicitly, thereby labelling these functions impure.

Guess what. That has been tried before in other languages. End result: most functions (pure) which later needed to be adapted to be procedures (impure) were forced to be pure anyway.

So, to me, Haskell has a construction, over which a lot of fuss has been made, which solves a problem I know won't be solved that way anyway.

But that's just me. And yeah, I do applaud those Haskellers for trying. And guess what, to date, Haskell is still my favorite language. I guess it's just natural to hate the most what you love the most.

[This is actually why I built my (stage 1) compiler in ML. It's not that I like it over Haskell. It is that I wanted to be on the safe side of things.]

[This is actually an argument against purity, not against monads I guess. Another debate.]

monads and haskell

I'd agree that monads are basically a design pattern. They are the pattern to fix up the core problem (to quote Peyton-Jones):

We begin with an apparently fundamental conflict. A purely functional program implements a function; it has no side effect. Yet the ultimate purpose of running a program is invariably to cause some side effect: a changed file, some new pixels on the screen, a message sent, or whatever. Indeed it’s a bit cheeky to call input/output “awkward” at all. I/O is the raison d’etre of every program. — a program that had no observable effect whatsoever (no input, no output) would not be very useful.
Well, if the side effect can’t be in the functional program, it will have to be outside it.

Or to paraphrase Abelson, "We experience time sequentially, and we want software which can model this". So yes, the shortcoming that monads address is well known. Almost every Monad I've ever used in Haskell would not even be an issue/concern in Visual Basic. But they do arise naturally in other languages when you have states which can mismatch: client/server, database vs. in-memory objects...

On the plus side, Haskell has great syntax for working with Monads (except for the fact that all Monads aren't automatically Functors). Now, it is certainly fair to say that FP/Haskell has the worst problems with state and thus the greatest need for Monads, but if you want to learn Monads, Haskell does make it easy:

1) Because the needs are so great you end up using them for things like I/O and random number generation.

2) Haskell currying is fantastic and currying is key to this solution.



So I'd say in answer to your four questions:

(1) It doesn't matter
(2) Easily handling most situations where you want to add functionality to a large block of code that would otherwise require changes every dozen lines or so
(3) Declare a type to be an instance of Monad and define two functions, bind (>>=) and return (see the sketch below)
(4) Yes.
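
A minimal sketch of step (3), using a made-up Logger type that pairs a result with a list of log messages (essentially a Writer monad), with the Functor and Applicative instances modern GHC requires:

newtype Logger a = Logger (a, [String])

instance Functor Logger where
  fmap f (Logger (a, w)) = Logger (f a, w)

instance Applicative Logger where
  pure a = Logger (a, [])
  Logger (f, w1) <*> Logger (a, w2) = Logger (f a, w1 ++ w2)

instance Monad Logger where
  return = pure
  -- bind runs the next computation and appends its log to ours
  Logger (a, w) >>= f = let Logger (b, w') = f a
                        in Logger (b, w ++ w')

-- a "primitive" operation in this monad
logMsg :: String -> Logger ()
logMsg s = Logger ((), [s])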

My Aha Moments.

1. Playing Robo-Rally, a board game, and realizing that I could model the instruction cards as a monad.

2. Monadic Parser Combinators by Graham Hutton

3. Slowly coming to realize that almost anything can be fit into a monad, somehow. Most definitions aren't monads, but most concepts have suitable monadic definitions. With that caveat, saying that something cannot be a monad is a far more interesting statement to me than saying something is.

As Anton pointed out, you can do pretty much anything you want, anything at all, inside a monad. Almost any language that you might desire can be created through suitable definitions of return and bind, and providing "primitive" monadic operations.

One of the more interesting Monads I've created was a bare-bones user-level concurrency monad implemented via continuations; it was this monad that I used to create a proof-of-concept experiment in Haskell, and then translate it into PVS to prove a theorem about race conditions.

The statement that some concept is a monad seems to be a statement more about the way you are expressing that concept than anything about the concept itself. There are meta-implications of being a Monad that I don't honestly understand very well.

The languages that can't be expressed as a monad, for the most part, seem to me to revolve around meta-issues such as type theory, and not intrinsic semantic limitations.

For example, if you want your embedded language to have certain static guarantees, or be implemented in certain ways to efficiently handle various cases, then the monadic approach might make this very difficult or impossible.

Monads take effort, time, and patience to understand. It took a while to wrap my head around them, but I think they are a far easier concept than, say, continuations.

another suggestion

I'd also suggest (as above): You Could Have Invented Monads! (And Maybe You Already Have.). Also Monads as Containers I think is excellent.

But ultimately the way to get them is to write one. Make it a stupid one, but write one, and then it clicks.
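
For instance, here is one deliberately "stupid" monad to write (a minimal sketch, essentially a poor man's Writer; nothing deep going on):

-- a value paired with an accumulated log
newtype Logged a = Logged (a, [String])

instance Functor Logged where
  fmap f (Logged (x, w)) = Logged (f x, w)

instance Applicative Logged where
  pure x = Logged (x, [])
  Logged (f, w1) <*> Logged (x, w2) = Logged (f x, w1 ++ w2)

instance Monad Logged where
  Logged (x, w) >>= f = let Logged (y, w') = f x in Logged (y, w ++ w')

say :: String -> Logged ()
say msg = Logged ((), [msg])

example :: Logged Int
example = do
  say "starting"
  say "done"
  return (2 + 2)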

On You Could Have Invented Monads! (And Maybe You Already Have.)

Oh! How could I forget it!? Thanks for reminding me! It gave me a sudden aha moment!

It's just an observation

Monads are a way to study denotational semantics from a categorical perspective. [Any structure consisting of a few functors abiding a few laws is called a monad.]

Now in Haskell, IO is a monad, in the categorical sense. I find that an irrelevant observation from a programmer's view. There are many more monads, and one doesn't need type classes or other constructs to define them (see my poor man's monads). Even the class Monad is a bit wrong, since you can define instances which are not monads if they don't abide by the monad laws. Then there is also the problem that the analogy sometimes fails [or category theory isn't very good at precisely describing operational semantics]: monads can be used to describe how computations are chained, but usually aren't very successful at describing what that means in an operational sense (i.e. if I combine two computations, you don't know which computation is executed first [and people assume that when two computations are chained they are executed in left-to-right order, which might not be true]). [This actually isn't a problem of category theory but a general problem of mathematics. Normally, one can only describe execution order by mentioning it as a side note or stating it as an explicit rewrite system.] [There is also the small problem that in category theory I think they usually don't explain what it means for a value to be mapped into a monad, or how it can be extracted again. Both are things a programmer usually wants to know.]

I personally feel that programmers shouldn't need to understand monads or category theory to program in Haskell (or any other language). It just confuses the beginning programmer.

(As a side note, they just as well could have named the IO monad 'Existential' since it hides state. That would have been just as irrelevant.)

[Again, I am not bashing on people who are justifiably studying programming languages from a categorical perspective. I am just stating that these observations are irrelevant and confusing to a beginning programmer.]

Some counter-observations

Monads are a way to study denotational semantics from a categorical perspective. [Any structure consisting of a few functors abiding a few laws is called a monad.]

They can also be seen, perhaps more simply, as a simplifying pattern for pure and/or lazy functional programs. From this perspective, their utility for programming is more directly obvious.

There are many more monads, and one doesn't need type classes or other constructs to define them (see my poor man's monads).

That may be true for individual monads, but it's not true for more general use of monads. If you doubt this, I recommend trying to implement the modular semantics approaches described in Modular denotational semantics for compiler construction or the related thesis and paper by Sheng Liang et al., while avoiding type classes or a similar mechanism.

I personally feel that programmers shouldn't need to understand monads or category theory to program in Haskell (or any other language). It just confuses the beginning programmer.

What do you feel the alternative should be, for a pure language? Or are you saying that there is room for better alternatives?

(As a side note, they just as well could have named the IO monad 'Existential' since it hides state. That would have been just as irrelevant.)

Coming up with externally meaningful natural language names for very general abstractions can be problematic. This is more of a limitation on our natural language, than on the formalisms which do a very good job of abstracting common patterns.

We (our species) made some progress when we were able to abstract from, e.g., a group of n goats to the number n which captured their cardinality. I wonder if early goatherds used to complain that it shouldn't be necessary to understand "integers", and by extension, number theory, just to herd goats?

[Again, I am not bashing on people who are justifiably studying programming languages from a categorical perspective. I am just stating that these observations are irrelevant and confusing to a beginning programmer.]

Again, it would not be less confusing for any programmer, beginning or otherwise, to have to write pure programs without a feature that addresses the issues that arise in those programs.

Somewhat tangentially, surely the only languages that should be designed expressly for beginning programmers are teaching languages?

ADT

I personally feel that programmers shouldn't need to understand monads or category theory to program in Haskell (or any other language). It just confuses the beginning programmer.

What do you feel the alternative should be, for a pure language? Or are you saying that there is room for better alternatives?

Most 'monads' can easily be modeled, and sometimes even better, through abstract data types which hide state; for example with existential types and higher-order functions. (There are many ADTs where one would like several 'binds' and several 'returns'.)

Trying to 'fit' every ADT into a monadic structure, I find, just hinders the correct development of 'clear' modules.

We (our species) made some progress when we were able to abstract from, e.g., a group of n goats to the number n which captured their cardinality. I wonder if early goatherds used to complain that it shouldn't be necessary to understand "integers", and by extension, number theory, just to herd goats?

Overabstraction, or an abused abstraction, is worse than using the right abstraction. Academics have a tendency to read too much into abstractions; the monadic pitfall is one example where a metaphor is simply used wrongly. I find the analogy too shallow and the metaphor more misleading than illuminating.

This is akin to goatherds understanding numbers and being bothered by philosophers stating "hey, you're counting! If you define 'zero' as 'λ f x. x' and 'addition' as 'λ m n f x. n f (m f x)' you'll understand it better." [In some sense, it is even worse. I would say that the current trend is something like: 'Hey, everywhere you do something which looks like you're counting, use Church numerals!']

[On abstraction:

Main Entry:
    ab·stract
Pronunciation:
    \ab-ˈstrakt, ˈab-ˌ\
Function:
    adjective
Etymology:
    Medieval Latin abstractus, from Latin, past participle of abstrahere to drag away,
    from abs-, ab- + trahere to pull, draw
Date:
    14th century

Translation: Anything you drag far away enough starts to look like a dot.]

[Uh, I guess I should say 'no offence intended']

Languages

Somewhat tangentially, surely the only languages that should be designed expressly for beginning programmers are teaching languages?

From experience, I think that somewhere around 60% of programmers don't even understand recursion or don't know how to apply it. I would therefore restate that as:

The only languages that should be designed expressly for programmers are teaching languages.

Different level

Surely that depends on whether those programmers were self-taught or not.

As an aside, understanding recursion, OOP, and continuations is much *easier* than understanding monads.

I irritate myself being irritated with something which irritates

while hoping not to irritate others. (Well, not too much.;-))

They just needed something in Haskell to perform IO in an imperative manner. For that you need some datatype which is unobservable (so you can't force evaluation), with an abstract value which you cannot construct yourself (which is 'supplied' by the environment so you can't start the evaluation yourself somewhere in the middle), on which you can chain update functions (while preferably interweaving local computations), and which has certain operational semantics (left-to-right evaluation).

They put all that into a monad and called it monadic IO. The monad just gives an imperative sugar to IO. It's not the monad which makes the IO work, it's all of the above.
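
As a rough sketch of those ingredients (my own reading, not GHC's actual implementation): an abstract World value the programmer can neither construct nor observe, threaded in order through update functions.

module SketchIO (IO', returnIO, bindIO, getChar', putChar') where

data World = World                        -- constructor not exported: unobservable

newtype IO' a = IO' (World -> (a, World)) -- updates chained on the hidden world

returnIO :: a -> IO' a
returnIO x = IO' (\w -> (x, w))

bindIO :: IO' a -> (a -> IO' b) -> IO' b
bindIO (IO' m) f = IO' (\w -> let (x, w') = m w
                                  IO' m'  = f x
                              in  m' w')

-- the real primitives are supplied by the runtime; stubs for illustration
getChar' :: IO' Char
getChar' = IO' (\w -> ('?', w))

putChar' :: Char -> IO' ()
putChar' _ = IO' (\w -> ((), w))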

I unfortunately had to figure that out myself (going through a ton of literature) before nice wikis existed which explain it easily. Hence the irritation.

Grrr....

Everything is a monad

I think people grasp rather well that IO has to be specially treated in Haskell after giving them a short explanation. They are more puzzled about "monad oriented programming" as the latest programming paradigm and hardly anyone has sold it as such yet. A programming paradigm is not about immediate utility ( e.g. the IO monad as a special solution for a particular problem that can be ignored in non-lazy languages ) but about design value. It's the promise of having a benefit from organizing code in a certain way and it is also always a belief system because good design cannot be benchmarked. A paradigm is a mindset that can reverse the relationship between utility and value: the IO monad is not a special hack anymore but it is the right thing to do from the very beginning while everyone else does it essentially wrong or simply "doesn't get it".

Once the pentecostal movement of a new paradigm has been activated, it is unremarkable that so many people are talking about their experiences with monads, that so many tutorials are written about them, or even that you try to explain/excuse your own initial confusion and highlight the moments of clarity. We might not believe in the father and the son anymore, but at least in the holy spirit.

Perfect

By the way, I have been meaning to reply to this for three weeks. I agree with you 100%, and I think this is a great comment.

Slight disagreement

I found it nonsense.

Of course

Naturally I figured some people would, that's why I wanted to reply.

If you care to elaborate on what you find to disagree with in Kay's post, maybe we can discuss it.

Frankly, I think a lot of technically-fascinated people would prefer to ignore cultural processes entirely, and retreat to a (fantasy) world where all change is driven by (and maybe even synonymous with) absolute technical progress.

I think a lot of

I think a lot of technically-fascinated people would prefer to ignore cultural processes entirely...

LtU always tried to emphasize the social (or cultural, if you prefer) processes.

Not that I am entirely sure I buy it as an explanation for the specific issue at hand, mind you.

Of course!

LtU always tried to emphasize the social (or cultural, if you prefer) processes.

Of course, that's one of the reasons I like it so much!

Not that I am entirely sure I buy it as an explanation for the specific issue at hand, mind you.

Fair enough. I wouldn't go so far as to claim it as the singular explanation, but I do think it's an element...

Well okay, since I am invited to respond


I think people grasp rather well that IO has to be specially treated in Haskell after giving them a short explanation.

Agreed.


They are more puzzled about "monad oriented programming" as the latest programming paradigm and hardly anyone has sold it as such yet.

I personally would call it a design pattern.


A programming paradigm is not about immediate utility ( e.g. the IO monad as a special solution for a particular problem that can be ignored in non-lazy languages ) but about design value.

I personally get annoyed by this 'design' pattern. From an FP view, it introduces an imperative pattern, which I would argue is the opposite of what you want in FP. (Note that both Peyton-Jones, and I think Wadler, have said in emails that monadic programming might not be what they wanted.)
I personally think for FP the solution should be combinatorial programming.

God, if you put a monad into your banner for FP (like the monad reader), I personally feel you might as well put an assignment statement together with it.


It's the promise of having a benefit from organizing code in a certain way and it is also always a belief system because good design cannot be benchmarked. A paradigm is a mindset that can reverse the relationship between utility and value: the IO monad is not a special hack anymore but it is the right thing to do from the very beginning while everyone else does it essentially wrong or simply "doesn't get it".

No way. As I stated before, the monad is just used to give IO an imperative sugar. IO works in Haskell because a set of very careful invariants is maintained, not because it is implemented in a monad. The monad has nothing to do with it.


Once the pentecostal movement of a new paradigm has been activated, it is unremarkable that so many people are talking about their experiences with monads, that so many tutorials are written about them, or even that you try to explain/excuse your own initial confusion and highlight the moments of clarity. We might not believe in the father and the son anymore, but at least in the holy spirit.

Yeah well, a bit over the top for my taste. Also, there was no excuse.

As I stated before, the

As I stated before, the monad is just used to give IO an imperative sugar.

But you have noticed that monads are used for plenty of other things, from error handling through parser combinators to expressing continuations. This cannot be undone by making authoritative statements like yours and insisting on staying faithful to their original motivation.

I've nothing against the idea of monads being a design pattern. A paradigm is a design pattern that lets you reverse your optics: you start to look at the world through this pattern. If monads were just a clever way to implement IO in Haskell, they would be far less controversial and there wouldn't be 1001 attempts to explain them.

Tssss........


But you have noticed that monads are used for plenty of other things, from error handling through parser combinators to expressing continuations.

Yes, it is a design pattern where you can interweave operations w.r.t. a particular state. That means you can model almost everything in it. [Since that state can model a parser, an error, a graph, a map, a http connection, etc.] Doesn't mean you have to, or that it is even a particularly good design pattern.


This cannot be undone by making authoritative statements like yours and insisting on staying faithful to their original motivation.

I just said there is no relation between I/O and monadic programming.


If monads were just a clever way to implement IO in Haskell, they would be far less controversial and there wouldn't be 1001 attempts to explain them.

Again, as stated before, there is no relation between I/O and monads except that in Haskell they decided to put I/O in a monad class.

[Let's drop it? I guess I just find monads overrated, that's all.]

More technical

If you want to chain actions, there is nothing wrong with function composition. (f o g)

If you want to perform calculations w.r.t. to a local state, write a combinator and use abstractions. (f $ \x y z -> g x y z)

If you want to do I/O in a lazy language, hide the World state, and use the above.

If you want to build a combinator library, look at what combinators you need. If you need to do some calculation w.r.t. some hidden state, use the above trick with the abstraction.
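
A tiny sketch of what that style looks like (all names below are made up for illustration): chaining is just composition, and local state is handed to a continuation by a combinator.

-- "f o g": chaining actions is just composition
pipeline :: Int -> Int
pipeline = (* 2) . (+ 1)

-- "f $ \x y z -> g x y z": a combinator threads local state to a continuation
withCounter :: Int -> (Int -> Int -> r) -> r
withCounter start k = k start (start + 1)

example :: Int
example = withCounter 0 $ \old new -> pipeline (old + new)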

Who needs monads?

More technical?

If you want to do I/O in a lazy language, hide the World state, and use the above.

Can you elaborate on what you mean here? Maybe just give pseudo-code that reads a character from stdin and prints it?

BTW: bonus points for ending a post with "Let's drop it" and then replying to that post an hour later with new arguments! ;)

Duh..

I responded because I thought some people don't follow what I actually mean.

I don't need to give examples. All monads naturally fall into that category: just replace $ with bind.

[Actually, the readChar example I would find too small. The more interesting examples are modelling UIs or protocols with combinators instead of monads.]

[Ok, not fair, here's a search library in pseudo-code ;-) :


import "system.hi"
import "list.hi"
import "calculate.hi"

namespace search (

    using system
    using calculate
    using list

    type search_result = \t => [ _top t | _bot text | _exc text ]

    type search = \a \b => calc b (search_result a) 

    def success:a -> search a b =
        [ r -> nret (_top r )]

    def failure:search a b =
        nret (_bot "")

    def exception:search a b =
        nret (_exc "")

    def message:search a b -> text -> search a b =
        [ p, m -> p <*
            [ _bot "" -> nret (_bot m)
            | _exc "" -> nret (_exc m)
            | x       -> nret x ] ]

    def nofail:search a b->search a b = 
        [ p -> p <*
            [ _bot m -> nret (_exc m)
            | x      -> nret x ] ]

    def parallel:search a b -> search a b -> search a b =
        [ p0, p1, s0 ->
            [ (_bot r1, s1) -> p1 s0
            | l             -> l ] (p0 s0) ]

    def serial: search a c -> (a -> search b c) -> search b c =
        [ p0, p1 -> p0 <* 
            [ _top x  -> nofail (p1 x)
            | _bot m  -> nret (_bot m)
            | _exc m  -> nret (_exc m) ] ]

    def apply: search a b -> (a -> c) -> search c b =
        [ p, f -> p <* 
            [ _top x  -> nret (_top (f x))
            | _bot m  -> nret (_bot m)
            | _exc m  -> nret (_exc m) ] ]

    def sequential: search a c -> (a -> search b c) -> search b c =
        [ p0, p1 -> p0 <* 
            [ _top x  -> p1 x
            | _bot m  -> nret (_bot m)
            | _exc m  -> nret (_exc m) ] ]

    def opt: (a->search a b) -> a -> search a b =
        [ p, r0, s0 ->
            [ (_bot r1, s1) -> (_top r0, s0)
            | l            -> l ]
            (p r0 s0) ]

    def serialopt: search a b -> (a -> search a b) -> search a b =
        [ p0, p1 -> p0 <*  
            [ _top m -> opt p1 m
            | l      -> nret l ] ]

    def <+> : search a b -> search a b -> search a b =
        parallel

    def <-> : search a c -> (a -> search b c) -> search b c =
        serial

    def <?> : search a b -> (a -> search a b) -> search a b =
        serialopt

    def <*> : search a c -> (a -> search b c) -> search b c =
        sequential

    def <!> :search a b -> text -> search a b =
        message

    def <@> : search a b -> (a -> c) -> search c b =
        apply


    def one:search a b -> search (list a) b =
        [ p -> p <@> [a -> cons a nil ] ]

    def plus: search a b -> search (list a) b =
        [ p ->
            one p <?> \l0 ->
            plus p <@> \l1 -> append l0 l1 ]

    def star: search a b -> search (list a) b =
        [ p -> plus p <+> success nil ]

    def plus_sep: search a c -> search b c -> search (list a) c = 
        [ p, q ->
            one p <?> \l0 -> 
            q <-> \u ->
            plus_sep p q <@> \l1 -> append l0 l1 
        ]

    def star_sep: search a c -> search b c -> search (list a) c = 
        [ p, q -> plus_sep p q <+> success nil ]


    def run: search a b -> b ->
            (a->b->c) -> (text->b->c) -> (text->b->c) ->
            c  = 
        [ srch, start ->
            [ f, t, e ->
                (a,b) = srch start;
                [ _top x  -> f x b
                | _bot m  -> t m b
                | _exc m  -> e m b ] a
            ]
        ]

)

This is a lexer 'type lexer = \a => search a reader.reader'

]

[So the question now becomes: why would I shoehorn that into a monad?]

I responded because I

I responded because I thought some people don't follow what I actually mean. I don't need to give examples.

But I only ask for an example because I still don't see what you're talking about. What roles are f and g supposed to have? The only way I know to encode IO in a pure functional language is with CPS style. Is that what you mean? Are you talking about abandoning purity?

(And BTW, my "let's drop it" comment was lighthearted)

Ok

I thought so, so I was editing my original post at the same time.

IO


But I only ask for an example because I still don't see what you're talking about. What roles are f and g supposed to have? The only way I know to encode IO in a pure functional language is with CPS style. Is that what you mean? Are you talking about abandoning purity?

Linear types can also be used to model IO. No, why? I'm just stating that since there is no relation between IO and monads, there is no reason to shoehorn IO into monads (just a return and a bind). Use whatever you like.

The bad part is that people are convinced monads are the key to IO. They are not; a set of invariants is.

IO and monads

As soon as you decide to reify program behaviors as values, it looks to me that you're pretty much stuck with passing continuations to constructs like 'getChar'. Once you go down that road, there is a natural monad about - the monad of partial sequences of operations (return yields a value without any external operations, bind composes sequences of operations). Whether you choose to call it out as such is up to you, but I think it's pretty naturally tied to IO.

Linear types don't really model IO the same way that monads do. Linear types are just a way to prevent side-effects during evaluation from hosing your otherwise pure language.

Monads do not model IO

Where in the monad laws can I find that the world state is inaccessible? What do they say about order of evaluation? What about programs in which I return values of type (IO int, IO bool): how do the monad laws enforce that they are not evaluated [in parallel or sequentially]?

Really, the correspondence is not more than superficial. [In the same sense that I don't think that that search library should be seen as a monad.]

[IO can be modelled with monads, that's something else.]

[getChar: (a -> IO) -> IO -> IO would do it.]

Monads are a structure

You're the one putting up "monads model IO" as a straw man. A monad structure arises naturally when modeling IO, but obviously that abstract structure does not account for the whole of IO. The statement "monads have nothing to do with IO" is analogous to "groups have nothing to do with the integers."

getChar: (a -> IO) -> IO -> IO
I don't think you need the second (IO) parameter here:

getChar :: (Char -> IO) -> IO
putChar :: Char -> IO -> IO
endio :: IO
main = getChar $ \a-> getChar $ \b-> putChar a $ putChar b endio

There is still a natural monadic structure (of partial sequences of operations) lurking nearby. [Edit: In case it's helpful - it's just a CPS monad, shown below]

type IO' a = (a -> IO) -> IO

getChar' :: IO' Char
getChar' = getChar

putChar' :: Char -> IO' ()
putChar' c k = putChar c (k ())

(>>=) :: IO' a -> (a -> IO' b) -> IO' b
m >>= f = \k -> m (\x -> f x k) -- edit: fixed bug

return :: a -> IO' a
return x = \k -> k x

In my opinion, IO and other sequences-of-operations would benefit from being directly modeled on top of a richer structure than just a monad, but why not abstract operations that apply to all monads?

groups have nothing to do with the integers

This is exactly the point which I think is being missed in this discussion. If the only group you study is the integers then you don't need group theory. As long as the discussion revolves around the IO monad there is no need to talk about monads. But the whole point of monads is that they are an abstraction, which means we have a common interface to a variety of instances. This is where the payoff comes. In the case of Haskell we get, among other things:


  1. a library of monad-related functions that give useful functionality for all monads simultaneously (a small sketch follows this list)
  2. a single sublanguage, do-notation, that provides, in effect, convenient domain specific languages for things such as state transformation, continuations, combinatorial searching, imperative programming and combinations thereof
  3. the opportunity to share reasoning between different monads, because experience gained with one monad can be transferred to another
  4. and all the other benefits you expect from code reuse when you have a shared interface
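
To make point 1 concrete, here is a small sketch (standard Haskell; 'pairs' is just a made-up example): one definition, written once against the Monad interface, usable with several different monads.

import Control.Monad (replicateM)

pairs :: Monad m => m a -> m [a]
pairs = replicateM 2

listExample :: [[Int]]
listExample = pairs [1, 2, 3]     -- all length-2 combinations, list monad

maybeExample :: Maybe [Int]
maybeExample = pairs (Just 7)     -- Just [7,7], Maybe monad

ioExample :: IO [String]
ioExample = pairs getLine         -- reads two lines, IO monad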

Agree, except...

The statement "If the only group you study is the integers then you don't need group theory" seems reasonable, but there is often good reason to abstract even when you only have one example - it can aid simplicity and clarity. And summarizing the statement as "groups have nothing to do with the integers" is a stretch.

Dissecting monads

Good, I agree with all the points sigfpe made. The question in this forum was what monads are. Now, I think all arguments have been laid on the table.

1. Monads are a construction from category theory.

2. Monads are a design pattern used in Haskell.

[3. IO is modelled with that design pattern in Haskell.]

Now, monads in Haskell are named after the monads from category theory. Some people, like 'Matt M', believe there is a deep correspondence; I believe that there is a shallow correspondence. Feelings are not facts, so there is no real end-conclusion here.

I believe the design pattern argument. I also believe that it works for Haskell. But I would like to emphasize that the monadic design pattern introduces an imperative manner of thinking into FP. This is, again, more a matter of conviction than anything else. No result here either.

As far as Anton's argument goes: there are no real other candidates than monadic IO. Well, I am partly convinced. There are linear types and other solutions like arrows. And as long as we don't look at other solutions because the monad way is the way to go, well, we won't find any other solutions, will we.

It starts to boil down to beliefs. I think we should leave it at that.

[On 3. There is monadic IO, and there are other forms of modelling IO. I would say that it therefore is more a design decision, than the natural way to go. Again, a matter of conviction.]

Dissecting understandings

As far as Anton's argument goes: there are no real other candidates than monadic IO.

That sounds as though my argument has been misunderstood, in a couple of ways.

First, I was talking about systems for handling effects in general, not about I/O specifically. I don't believe I've mentioned I/O before now. If the only issue were I/O, there are various other ways to address that. But since I/O is an effect, it's naturally dealt with by whatever mechanism is used to handle effects in a language.

Second, regarding other candidates for handling effects, that point came up because some of the criticisms of monads were assuming that equally good alternatives to monads are already available — e.g., ad-hoc combinators. I disagree with that, and explained why.

I'm not arguing that monads should be the ultimate solution for handling effects. However, the dearth of existing alternative solutions, and the weaknesses of those that do exist, make some of the criticisms here rather empty.

There are linear types and other solutions like arrows.

I speculated earlier about arrows myself, but I notice no-one in this thread has been arguing in favor of using linear types over monads. That discussion would at least involve some concrete comparisons.

And as long as we don't look at other solutions because the monad way is the way to go, well, we won't find any other solutions, will we.

I'm all for looking at other solutions. That's what I was getting at in my previous comment, when I wrote "We don't seem to be actually discussing such an alternative framework". But to discuss monads (the subject of this thread) in the context of other solutions, we'd need some agreement on the basic facts of what monads offer, which we only now seem to be starting to converge on.

Alternatives

Linear/uniqueness types are used successfully in Clean, so that's a viable alternative. I can't say it is a particularly attractive solution since it really places a burden on the programmer and the type system, but it works well for them.

It seems the stream processing approach died because it is difficult to model method calls since you need a kind of split-phase modelling. Note that in languages like NesC split-phase is actually seen as 'the way' to implement reactive systems. I understand arrows as an advanced form of stream processing functions.

I think monads are more popular because they a) make it relatively easy to implement bindings with impure libraries b) they fit the mind-set of the imperative programmer. a) is good, b) yeah, well.

I personally like stream processing/arrow-like solutions more than monads, or uniqueness types, because they exhibit a component-like structure while retaining purity.

I think other approaches might come from process algebras or action semantic frameworks, but I haven't seen any yet. [I forgot to mention Mealy machines and I/O automata.]

Oops

I implied in this thread that a monad structure was inevitable for functionally encoded IO, but I know that's not true. The point I stand by is that certain functional encodings, like the alternative you suggested for getChar, are monadic whether you make the structure explicit or not. There are other functional encodings (arrows being an example) that do not yield monads. Such a non-monadic encoding might have desirable properties, such as only specifying a partial order for the effects.

On the other hand, it may

On the other hand, it may well be missing others. An arrow without application (an arrow with application yields a monad) doesn't support higher-order computations - one might quite reasonably feel that's 'less FP' than the monadic approach.

Aesthetics

I don't fully understand your comment. If arrows are a generalization of monads, I don't think you can reasonably state that there is fundamentally a big difference in properties; they support higher-order computations just as well, or as badly, as monads.

Apart from theoretical observations relating to Kleisli and others. The monadic approach can be seen as a step towards imperative, maybe OO, programming. Arrows can be seen as a step towards component based programming. It is more of an aesthetic argument: arrows are a natural step from functions which take input and deliver output to components which take input and deliver output, so they fit FP better.

Actual design decisions

As a generalisation of monads, we can fully expect them to be weaker than monads - and they are. Arrows without the ArrowApply class or an equivalent do not support higher-order computation. In fact this was the entire point in having them, because it makes the structure of a computation decidable and thus makes techniques like self-optimising combinators viable.

If you just want the different combinator set with the same strength that's fine, but then the difference amounts to whether you want your focus to be on values or their transformers. Funnily enough, it's the sugar for arrows that always struck me as most C-like - you may as well replace -< with a pair of parens for a function call.

Well...

Can you give an example?

Of what? The first example

Of what? The first example of an arrow that isn't a monad that comes to mind is the set of self-optimising parsing combinators that helped inspire arrows in the first place. Another would be combinators for circuit design, as in the absence of an FPGA you can't have your circuit build more circuits to plug into itself! I have to admit that I tend to look to applicative functors first when I want to build something self-optimising, as my own taste leans towards making values easier to work with.

Not supporting HOFs


Of what?

Arrows not supporting higher-order functions.


The first example of an arrow that isn't a monad that comes to mind is the set of self-optimising parsing combinators that helped inspire arrows in the first place.

Hmpf, I think the literature on arrows is a bit older than that. The syntax closely relates to stream processing functions for modelling reactive systems, and that goes well before the fifties even.


Another would be combinators for circuit design, as in the absence of an FPGA you can't have your circuit build more circuits to plug into itself!

Ni hau are the only two words Chinese I know.


I have to admit that I tend to look to applicative functors first when I want to build something self-optimising, as my own taste leans towards making values easier to work with.

We were discussing alternatives for IO.

Of what? Arrows not

Of what?

Arrows not supporting higher-order functions.

The term I used was higher-order computations and I picked it for a reason. This is rather simple though, it's a matter of what the axioms allow you. For an arbitrary arrow ~>, you can't take a Unit ~> (Unit ~> Int) and turn it into a Unit ~> Int. You can operate on the value all you like, same as any other type, but you can never 'run' it. To do that you need an "arrow with application", as found in the ArrowApply class.
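
A small sketch of that distinction using the standard Control.Arrow vocabulary ('runNested' is a made-up name):

import Control.Arrow (Arrow, ArrowApply, app, arr, (>>>), (&&&))

-- not writable for an arbitrary Arrow a:
--   runNested :: Arrow a => a () (a () Int) -> a () Int

-- with application it is easy: pair the inner arrow with its input and feed it to 'app'
runNested :: ArrowApply a => a () (a () Int) -> a () Int
runNested f = (f &&& arr id) >>> app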

Hmpf, I think the literature on arrows is a bit older than that. The syntax closely relates to stream processing functions for modelling reactive systems, and that goes well before the fifties even.

Yet they weren't actually formulated as arrows. It's not unfair to name the first things that were.

Another would be combinators for circuit design, as in the absence of an FPGA you can't have your circuit build more circuits to plug into itself!

Ni hau are the only two words Chinese I know.

Har har. You asked for examples of arrows not supporting higher-order computations. The things I have listed are arrows and do not support higher-order computations.

I have to admit that I tend to look to applicative functors first when I want to build something self-optimising, as my own taste leans towards making values easier to work with.

We were discussing alternatives for IO.

Applicative functors are yet another alternative in approximately the same part of the design space as monads and arrows, and the facet I was commenting on is perfectly relevant to IO-specific instances. Incidentally, applicative functors don't support higher-order computation either. They would if you added join, but then you would have a monad.
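
For what it's worth, the gap can be stated in one line: with Applicative alone you can combine independent computations, but you cannot collapse a nested one; adding that ability (join) is exactly what makes it a monad. A sketch:

import Control.Applicative (liftA2)

-- fine with Applicative: combine two computations' results
combine :: Applicative f => f Int -> f Int -> f Int
combine = liftA2 (+)

-- not definable for an arbitrary Applicative; this is 'join',
-- the operation that upgrades an Applicative to a Monad
-- collapse :: Applicative f => f (f a) -> f a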

In the meantime, if you'd like to understand what I'm telling you I'd appreciate it if you stopped trying to call me on topicality. In case you hadn't noticed, we were already discussing general properties of an alternative to monads which you happen to like for IO.

Good, HOCs then


The term I used was higher-order computations and I picked it for a reason. This is rather simple though, it's a matter of what the axioms allow you. For an arbitrary arrow ~>, you can't take a Unit ~> (Unit ~> Int) and turn it into a Unit ~> Int. You can operate on the value all you like, same as any other type, but you can never 'run' it. To do that you need an "arrow with application", as found in the ArrowApply class.

So, what's the difference with an IO monad? Second, you are thinking in terms of Haskell libraries. I don't care.


In the meantime, if you'd like to understand what I'm telling you I'd appreciate it if you stopped trying to call me on topicality. In case you hadn't noticed, we were already discussing general properties of an alternative to monads which you happen to like for IO.

Self-optimising wasn't part of the discussion, yet. Yes it is easy to write self-optimizers for arrow/component-like structures. But, that's step two.

Meanwhile, we are walking down the wrong path. Components are a valid manner of dealing with IO, you can extend a language with a syntax for components. There is no reason why components could not accept components as arguments too; you can run them if you allow some kind of 'plumbing' component (mathematically similar to a run).

You are thinking math/Haskell libraries; I am thinking language design.

The dangers of abusing terminology

Apparently you're not actually talking about arrows at all! Which is unfortunate, because it's surprisingly hard to get away from them entirely.

So, what's the difference with an IO monad?

With the monad, I can turn IO (IO Int) into an IO Int no problem, I'm just not supposed to go from IO Int to Int. As I just stated, you can't do the equivalent of IO (IO Int) -> IO Int in an arrow unless it supports application - at which point it's equivalent to a monad anyway. Perhaps you missed that ~> occurs twice in Unit ~> (Unit ~> Int)?
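
The monadic operation being referred to here is just the standard join (shown for concreteness):

import Control.Monad (join)

collapse :: IO (IO Int) -> IO Int
collapse = join                  -- join m = m >>= id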

Notice that when the arrow doesn't have application, it doesn't have currying either. You see what I mean about arrows without application not feeling too much like FP?

Second, you are thinking in terms of Haskell libraries. I don't care.

No, I'm illustrating in terms of them. Perhaps you missed the grand tradition of embedding languages in Haskell, and need telling that those libraries are in fact language design examples?

Self-optimising wasn't part of the discussion, yet.

Discussion doesn't work like that.

It appears to have gone completely over your head, but I've been saying throughout that the interesting cases with arrows and their ilk occur with design decisions that in many regards feel less like FP, not more. You haven't even started to deal with this, because you've spent your time getting confused courtesy of your own terminological abuse and complaining about the fact I'm using examples and terminology taken from an existing language in a discussion about language design!

Meanwhile, we are walking down the wrong path. Components are a valid manner of dealing with IO, you can extend a language with a syntax for components. There is no reason why components could not accept components as arguments too; you can run them if you allow some kind of 'plumbing' component (mathematically similar to a run).

You mean exactly like the apply operation supported by arrows with application, which I mentioned earlier? Damn, I totally missed that.

You are thinking math/Haskell libraries; I am thinking language design.

I'm thinking the construct you - and especially I! - named. But there are significant ramifications for language design, because an arrow with application is equivalent to a monad. The experience in the Haskell community is that if you want to write with named variables then the arrow-based code looks every bit as imperative and arguably an awful lot fuglier. If you want to write point-free code, each of the arrow combinators has a monadic equivalent - as it happens, these can be built from >>= and the combinator instances for the function arrow.

If you want your component-based language to differ significantly from the 'imperative' approach you dislike with monads, you don't want higher-order computations - you want to trade them off for something else. You definitely can't have Turing completeness, higher-order computations with free access to pure computation and easy self-optimisation all together in one language - you run up against the halting problem hard.

Programming languages are maths in action - if you're thinking language design then you're thinking maths whether you like it or not and all that's left is whether you're misusing words to describe something else. If you have a 'component' structure in mind that's substantially unlike arrows then I'm all ears, but be aware that there will be serious design tradeoffs involved.

Enough arrows

I gave arrows as one of many viable alternatives for IO, nothing more. I wasn't thinking about how they are implemented in Haskell, just as one possible solution and, true, possibly as a semantics for a component based language. Given that, the exact relation between arrows and monads is probably not even interesting.

The rest I was already well aware of.

I was discussing arrows in

I was discussing arrows in much the same light as you were, and where I mentioned specific typeclasses it was as an alternative name for specific flavours of arrows with extra structure - Haskell has otherwise been irrelevant except perhaps as an example.

I raised the relationship between monads and arrows precisely because it gives you a choice between something that's little more than a different syntax and something significantly different - certainly I find it useful to know where self-optimisation or various analyses are practical.

Looks like this has been a waste of both our time, though. Good luck finding something that works for you.

Understanding dissections

Now, monads in Haskell are named after the monads from category theory. Some people, like 'Matt M' believe there is a deep correspondence, I believe that there is a shallow correspondence.

Huh? When did I say anything about how faithfully monads in Haskell correspond to mathematical monads? I'm not sure I understand what you meant there.

My position is that IO can be modeled functionally without ever even knowing what a monad is, but that the obvious ways to model it functionally will exhibit a monadic structure whether you make it explicit or not and whether you know it or not. I personally think that monads are a useful abstraction to make explicit for reasons like those sigfpe mentioned. Edit: But let me also add that the monadic abstraction isn't the only abstraction that applies to IO, nor is there anything mystical about the fact that IO is a monad.

Finally, it may surprise you to learn that I personally don't think the syntactic support for monads in Haskell is ideal - the syntactic support I favor (e.g. for IO) is tied to a richer structure.

Dissecting dissections


Huh? When did I say anything about how faithfully monads in Haskell correspond to mathematical monads? I'm not sure I understand what you meant there.

If you explain/understand a design pattern in terms of a categorical structure you at least see a deeper correspondence than I see. That's all.

Well

def sequential: search a c -> (a -> search b c) -> search b c

That looks pretty much like a monadic bind. Is your point just that you'd rather not explicitly call out the monadic structure, or that you think the right structure isn't monadic?

I think you could answer most of my questions by just giving me the type of 'main'. What's the type of the top level program that interacts with the OS?

If I squeeze my eyes I see...


That looks pretty much like a monadic bind. Is your point just that you'd rather not explicitly call out the monadic structure, or that you think the right structure isn't monadic?

It also looks like a type. And the term is reducible to SKI. And I also see some higher-order pattern in it. And the search has two arguments, so it must be some special other thing than a monad. [I am just tired here, not sarcastic. No offence.]

Why would I call it a monad because exactly one operation out of a dozen or more looks like a bind? Sorry, I don't like to see ghosts.

As I said before in another post, I find the analogy too shallow and the metaphor more misleading than illuminating.


I think you could answer most of my questions by just giving me the type of 'main'. What's the type of the top level program that interacts with the OS?

'IO', or 'IO a', or even 'IO a b c', or 'Program'. Who cares? As I stated, you can model it like you want, as long as a few invariants are maintained.

[Oh, if you're asking for the type of main for the program above: It is an impure, eager, language. So, no monads in sight.]

I'm sorry...

I'm sorry, but I think you've completely missed the point. This (and the rest of the posts in this thread) focus completely on your opinion of monads themselves, as a pattern, tool, technique, whatever. But Kay's post does not claim that monads are either good or bad! Whether they are good or bad is not the point! The point is simply an observation about the process by which a solution to a particular problem is turned into an "orientation."

Do you disagree with that observation in general?

Honestly, I don't really care what you think of monads. I have no horse in that race (to tell the truth, I'm rather against the whole sport of horse racing).

I didn't read it as such, that's true

Oh, right. No, I really agree with that observation. Yeah, I really think a lot of monadic stuff is more of a cultural issue. 'Look, I found a gem, don't you like it?' And then all people start to share intellectual gems. Much in the same way as new languages are adopted. So yeah, it really is an early-adopters game.

I guess to some extent science works like that. A crowd picks up an idea, looks at all sides of it, and then drops it again.

[And no, I don't care too much about monads either.]

[And I guess they really have a value in axiomatic semantics.]

Frankly, I think a lot of


Frankly, I think a lot of technically-fascinated people would prefer to ignore cultural processes entirely, and retreat to a (fantasy) world where all change is driven by (and maybe even synonymous with) absolute technical progress.

What do you mean? I see that there is a lot of monad fanboy-ism going around; I can also see that it has its use, though. Dissecting fanboy-ism doesn't mean one lives in a fantasy world. Fanboy-ism just leads to poor, sometimes costly, decisions.

I am getting old I guess. I want my information given concisely and to the point, and preferably not have my time wasted on bad interpretations or irrelevant metaphors [as I find them in 1001 explanation attempts]. That's no fantasy world, that's the real world.

Thanks a lot

This wiki explains it much better than the (too many) articles I've read, which focus too much on the math part of monads and made my brain hurt.

I have not entirely finished the wiki yet, but it's already much clearer to me than the rest.
I feel a little 'cheated' though: I finally understand that monads are just a way to compose functions which have a 'shared' thing, and that since Haskell makes the 'world state' explicit as a parameter, it needs this kind of composition, because doing it by hand would be too ugly.

What about Calc?

Coming up with externally meaningful natural language names for very general abstractions can be problematic. This is more of a limitation on our natural language, than on the formalisms which do a very good job of abstracting common patterns.

I personally would like it more if they would replace 'Monad' with 'AbstractCalculation' or just 'Calc'.

[Also, you said: That may be true for individual monads, but it's not true for more general use of monads. If you doubt this, I recommend trying to implement the modular semantics approaches described in Modular denotational semantics for compiler construction or the related thesis and paper by Sheng Liang et al., while avoiding type classes or a similar mechanism.

Note that I implemented parser combinators using the combinators of that ML library I referred to in the original post. I don't think it would be difficult to implement the monads of that thesis, at least I don't see a reason why that wouldn't work as well as the parsing library. No type classes needed.]

Credible alternatives in short supply

Note that I implemented parser combinators using the combinators of that ML library I referred to in the original post. I don't think it would be difficult to implement the monads of that thesis, at least I don't see a reason why that wouldn't work as well as the parsing library. No type classes needed.]

I doubt that. I'm responding to this now because I think it has some relevance to the discussion currently continuing elsewhere in the thread.

I don't think the parser combinator example demonstrates that monads can be easily replaced by ordinary combinators in general. While it's possible in certain cases, that misses the point of monads, monad transformers, and do notation as a generic framework for modular composition of code, particularly effectful code and certain other kinds of code, in an otherwise pure language like Haskell.

The reason I mentioned the Liang/Hudak/Jones paper is that it demonstrates monads being composed, using monad transformers, to achieve modularity and composability for different kinds of effects in the same program. This is otherwise not so easily achievable in a pure functional program.

In the paper, interpreters with particular semantics are constructed by building a single monad out of a base monad and a set of monad transformers that implement features such as state, environments, exceptions, continuations etc. Being able to compose these features in various ways without a lot of custom code is more difficult than it might seem. That's why I suggested trying to implement the examples in the paper. There's not that much code, so it ought to be a manageable exercise — but without monads, it's a very challenging one.

In the more general Haskell context, this approach has resulted in the ability to implement a large variety of custom monads by combining a small set of standard monad transformers. This often requires absolutely no code specific to the composed monad - the transformers and type inference can do all the work. This provides a way to manage multiple different kinds of effects in the same program, without requiring that a new, custom monad - or other custom infrastructure - be constructed from scratch for any of the combinatorial explosion of effect types that programs need.
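
As a small sketch of that (using the standard transformers/mtl libraries; Interp, step and runInterp are made-up names), state and exceptions stacked over IO with no monad-specific code:

import Control.Monad.State  (StateT, get, put, runStateT)
import Control.Monad.Except (ExceptT, throwError, runExceptT)

-- state + exceptions over IO, assembled entirely from standard transformers
type Interp a = StateT Int (ExceptT String IO) a

step :: Interp ()
step = do
  n <- get
  if n > 3
    then throwError "counter overflow"
    else put (n + 1)

runInterp :: Interp a -> IO (Either String (a, Int))
runInterp m = runExceptT (runStateT m 0)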

All of this goes well beyond the core mathematical definition of a monad, but it's the properties of that definition that make this larger framework possible, along with type classes to support the required genericity to help make it all usable. Implementing and using individual monads is trivial, in any reasonable language - it's composing them and using multiple monads in the same program that raises the need for transformers and something like type classes.

In this broader context, the simplicity of the core mathematical definition of a monad is deceptive - there's a lot more functionality than that built into the overall supporting framework, and it all contributes to the expressive power that monads provide.

To achieve anything like this with ordinary combinators, without simply giving up purity, you're either going to end up with an ad-hoc approach in each case — which was the very problem monads and monad transformers were originally employed to solve — or else you'll be building some other, competing framework to address these issues.

We don't seem to be actually discussing such an alternative framework, which makes some of the comments that imply that monads aren't really needed, or are some sort of cultural fixation, rather puzzling. My current impression is that this is a perspective which comes from working primarily with impure languages, and perhaps not fully appreciating the constraints and issues that purity introduces, the limited range of solutions currently available for addressing these issues, and the success which monads have had in addressing them.

dusty cuneiform decks

We (our species) made some progress when we were able to abstract from, e.g., a group of n goats to the number n which captured their cardinality. I wonder if early goatherds used to complain that it shouldn't be necessary to understand "integers", and by extension, number theory, just to herd goats?

Way back in the bronze age, people who counted things like jugs of wine or loaves of bread or other things that stayed where you put them liked to make nice rectangles and then count them off in groups of six or twelve. People who counted things like cows or goats that didn't stay put so nicely preferred to wait until they were moving through a gate, and then use their fingers to count them off in groups of five or ten.

These traditionally different abstractions for cardinality may be why it's still possible in this day and age to buy incommensurable packs of hot dogs and buns.

ObLTU: some customs are remarkably resistant to being abstracted away beyond necessity

A monad - last comment

I guess I should have elaborated on the evaluation order. A monad has very precise semantics; they are given below.

Definition 1. A monad M is a type constructor, together with two operations:

then : M a -> (a -> M b) -> M b
return : a -> M a

satisfying the following laws:

(return a) then k = k a          (left unit)
c then return = c                (right unit)
c1 then \v1.(c2 then \v2.c3) = (c1 then \v1.c2) then \v2.c3
                                 (associativity)

That's it. Nothing more. Anything satisfying those laws is a monad. It says nothing about evaluation order, it says nothing about what M looks like, it says nothing about type classes, it says nothing about IO; it's just a type constructor satisfying certain laws. [Or: any type constructor satisfying these laws is a monad.]
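
For concreteness, here is one trivial type constructor satisfying the definition (Maybe, with 'then' written out by hand):

thenM :: Maybe a -> (a -> Maybe b) -> Maybe b
thenM Nothing  _ = Nothing
thenM (Just x) k = k x

returnM :: a -> Maybe a
returnM = Just

-- left unit:      returnM a `thenM` k       ==  k a
-- right unit:     c `thenM` returnM         ==  c
-- associativity:  c1 `thenM` (\v1 -> c2 `thenM` \v2 -> c3)
--                   ==  (c1 `thenM` \v1 -> c2) `thenM` \v2 -> c3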

From a programmer's perspective, I would say it is hardly interesting. It's a manner of dealing with hidden state. It gives a guideline for implementing trivial embeddings that deal with encapsulated state, in terms of some injection operation into that state (return) and a manipulation operation [for chaining actions] on the state (then).

There are many, many, many 'ADTs', or combinatorial libraries, which are more easily written with different/more-specialized injection operations and other/more-specialized updates on the hidden state (not to mention that it is often also really nice if one can observe/extract specific parts of the state).

An analogy: it's the list example of recursive data structures.

[And I'll never ever make a comment on monads from this point on. Yuck! ;-)]

Monads as Finite State Machines

I'm surprised this association hasn't been used to describe monads to imperative programmers. It seems like a much more natural fit to their state-based mindset.

A monad looks to me like a finite state machine (black box), defined by a set of state transition functions. Bind is a way to sample the outputs of the black-box FSM and trigger state transitions.

The 'left identity' law defines the initialization instruction of the FSM, 'right identity' says the initialization instruction is a no-op for an initialized FSM, and the 'associativity law' states that sequentially binding two state sampling functions is equivalent to creating a new sampling function whose body sequentially applies the two sampling functions (which seems obvious).

Seems like a natural place to start for an explanation of monads.
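
A possible sketch of that reading (the names are made up; this uses the standard State monad): a tiny traffic-light machine whose transitions are sequenced by bind.

import Control.Monad.State (State, get, put, evalState)

data Light = Red | Green | Amber deriving Show

-- one transition of the black box
stepLight :: State Light Light
stepLight = do
  l <- get
  let next = case l of Red -> Green; Green -> Amber; Amber -> Red
  put next
  return next

-- bind sequences transitions; run from an initial state
threeSteps :: State Light [Light]
threeSteps = do
  a <- stepLight
  b <- stepLight
  c <- stepLight
  return [a, b, c]

-- evalState threeSteps Red == [Green,Amber,Red]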