What is a monad, why should I use it, and when is it appropriate?

Can anyone answer this in a post without showing code? And if possible, relate it to OO concepts like classes and delegates.

As long as we're asking questions...

...and since we seem to be getting into foundational mathematics these days... Where do monads crop up in classical mathematics?

And while I'm at it, what is the distinction between the Haskell Maybe monad, and the union type of ML?

   datatype 'a maybe = Some of 'a | Nothing

Mathematical Monads and the Maybe Monad

Wikipedia says that 'monad' in mathematics can refer to something from non-standard analysis as well as something from category theory. I'm only familiar with the latter usage, and only to some degree; I've only recently begun to study category theory. In category theory, monads are related to concepts like functors. I won't try to explain more, since I'd probably put my foot in my mouth... However, there are some books on the topic which may be beneficial, including Benjamin Pierce's quite short Basic Category Theory for Computer Scientists.

As for the ML type (I assume you mean something like 'option'), the only difference between that and Haskell's Maybe monad is that, in addition to the ADT, Haskell also defines a set of functions over the type.

You need to understand type classes in Haskell to understand exactly how this works, but the simplified account is that you have two polymorphic higher-order functions: one called 'return', and one called 'bind', the latter usually written as the infix operator (>>=). By using the functions 'return' and 'bind' in certain combinations with other functions, you can represent a composite computation which may produce an 'option' or 'Maybe' result at various intermediate stages. If at one of those stages the computation fails, or you have a result of something like 'none', then you can cascade this failure across the remaining stages of the computation without having to explicitly deal with possible failure at every single stage.

For example, say you wanted to add 3 potential numbers, all of which were embedded in the 'option' type. Without a monad, you'd have to check each potential number before adding it to the others, to see whether it was an actual number first; if it wasn't, you'd set the result of the entire addition expression to 'none'. In the Maybe monad, you don't have to do this. You just specify the entire computation up front, without any explicit nested failure checking. It is the responsibility of the higher-order functions 'return' and 'bind' to "handle the details" of the failure implicitly. If failure does occur somewhere early on in a complex computation, the effect of this will cascade in a safe and predictable way.

The Monad interface is only one way

You can't get things out of the Maybe Monad (without additional operations), but you can get things out of the Maybe datatype:


data Maybe a = Just a | Nothing

instance Monad Maybe where
    Nothing >>= _  = Nothing
    (Just x) >>= k = k x
    return = Just

maybe :: b -> (a -> b) -> Maybe a -> b
maybe n f Nothing  = n
maybe n f (Just x) = f x


maybeM :: (Monad m) => b -> (a -> b) -> m a -> b
maybeM = error "impossible"

The Maybe datatype is a monad

You can't get things out of the Maybe Monad (without additional operations), but you can get things out of the Maybe datatype:

I'm not sure I take your meaning. I would say that the Maybe type constructor is a monad; there's no distinction between the "maybe monad" and the "maybe datatype".

It's true that the monad operations don't provide any way out of the monad, but every notable monad supports additional operations. Every monad in Haskell has to provide a way to construct a transformation of the type ∀a.m a → IO a, or else it can't be used by main or any function called by main. Most of them also provide a way to construct a transformation of type ∀a.m a → a; the only exception I can think of is IO.

Triples

I'm not sure I take your meaning. I would say that the Maybe type constructor is a monad; there's no distinction between the "maybe monad" and the "maybe datatype".

No.

It is common to identify the monad with the functor/type constructor, as both mathematicians and computer scientists are wont to do with nearly everything, but, as the alternate name "triple" for monad suggests, there is more to a monad than the type constructor. A monad is a triple (T,return,join) that satisfies some laws. To emphasize this point, Maybe is a monad in at least two ways (though the second is not a "notion of computation" a la Moggi, i.e. a strong monad with a mono return); besides the typical way, the definition

    return _ = Nothing
    _ >>= _  = Nothing

also satisfies the monad laws (trivially), though I haven't listed them.

Rats

I'm not sure I take your meaning. I would say that the Maybe type constructor is a monad; there's no distinction between the "maybe monad" and the "maybe datatype".

No.

Oops. I was thinking about Haskell, where you get at most one instance of Monad per type, but, as you say, monads are more general.

I should have remembered that, because I noticed the same thing when I was working with error monads, which can be defined as monads two ways. For example, Haskell's Either type constructor has two monads: (Either a, Right, either Left id) and (Λa.Either a b, Left, either id Right).

Maybe in Haskell is the same as in ML

[W]hat is the distinction between the Haskell Maybe monad, and the union type of ML?

None. You can define return and bind in ML just as easily as in Haskell. The only differences are, (1) Haskell overloads the operation names using type classes, and (2) Haskell's do-notation provides a convenient way to write monadic code.
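To make point (1) concrete, the monad operations can be written as ordinary top-level functions with no overloading at all; here is a sketch in Haskell of what the ML version amounts to (the names returnMaybe, bindMaybe, and addTwo are invented for illustration):

```haskell
-- Monad operations for Maybe as plain functions, with no type-class
-- overloading; this is essentially what one would write for ML's option.
returnMaybe :: a -> Maybe a
returnMaybe = Just

bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing
bindMaybe (Just x) f = f x

-- Without do-notation, the sequencing is spelled out by hand:
addTwo :: Maybe Int -> Maybe Int -> Maybe Int
addTwo ma mb =
    ma `bindMaybe` \a ->
    mb `bindMaybe` \b ->
    returnMaybe (a + b)
```

The do-notation in Haskell desugars to exactly this kind of chain of binds, which is why the difference is one of convenience rather than expressiveness.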

Monads in classical mathematics

Where do monads crop up in classical mathematics?

There is a one-to-one, onto relation between finitary monads on the category Set and finitary algebraic theories on Set. This means that all of the ubiquitous algebraic varieties, like monoids, groups, rings and modules are monads (but not fields, because division is partial). The unit operation says what a variable is; the multiplication operation encodes substitution modulo the laws. This means that whenever you want to talk about algebraic theories in a way which does not depend on their syntax/presentation (choice of operators or laws) monads are the appropriate tool. And if you want to talk about properties of ALL algebraic theories, then you can do it in an abstract fashion by studying properties of monads in general.
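To make the free-monoid case concrete: lists realize it directly. A small sketch (the names var and subst are chosen for illustration):

```haskell
-- The free monoid over a set, concretely, is the list monad:
-- 'return' builds a one-element word (a variable), and 'join'
-- performs substitution, which for words is concatenation.
var :: a -> [a]
var x = [x]

subst :: [[a]] -> [a]
subst = concat
```

So subst [var 1, [2,3], var 4] splices the inner words into one word, just as substituting terms for variables does in a monoid expression.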

In poset theory, a monad is a closure operation, so when you talk about order-theoretic fixpoints you are talking about monads. And these are closely related to Galois connections, which are pervasive.

And while I'm at it, what is the distinction between the Haskell Maybe monad, and the union type of ML?

ML has no (primitive) union type. It has a way to form (almost) finitary coproducts, more or less as Haskell does. It is because people subscribe to oversimplifications like these that they end up frustrating themselves, trying to apply set theory where it isn't sound.

ML union


ML has no (primitive) union type

Could you please explain why this:


datatype A = L of a1 | R of a2;

is not a union type? How is it different from what Luca Cardelli defines as the union type in his "Type Systems" (1997):

"An element of a union type A=A1+A2 can be thought of as an element of A1 tagged with a left token [...], or an element of A2 tagged with a right token [...]."


trying to apply set theory where it isn't sound

What exactly do you mean by that? Could you give an example of such an application?

Answers

  1. A monad (in the FP sense) is a way to structure parts of purely functional programs so that the functions involved are applied in a particular sequence
  2. You should use it because it makes programming in a purely functional languages a much nicer experience when dealing with things in which sequence matters
  3. It is appropriate whenever the sequence in which certain operations are carried out determines the correctness of the results, such as when you are dealing with mutable state

The theme, as you've no doubt noticed, is sequence. Imperative languages have it. Purely functional languages nominally don't. Monads allow you to recapture the ability to perform sequenced operations where necessary, while still using the (easier to reason about) functional approach.

At least, that's my understanding of things. But I have to admit that I haven't actually done more than a bit of dabbling with monads in Haskell. I'm sure someone will correct me if I'm way off base here.

Monads as Abstractions of Computational Characteristics

I essentially agree with the parent poster on the points raised. I think there is another, perhaps more general way of looking at monads as well though, and that is by regarding them as a way of specifying a very particular type of computational abstraction.

To understand what I mean by this, it is beneficial to look at the types of monads already defined in the Haskell libraries. Consider the following:

  • Error Monad - computations which may throw exceptions or produce well-known types of errors.
  • IO Monad - computations which may depend on the success of input or output, or which may refer to values operating beyond the scope of the nominal purely functional evaluation.
  • Maybe Monad - computations which may fail.
  • Reader Monad - computations which maintain an environment (e.g., a set of lexically scoped variable bindings).
  • State Monad - computations which may utilize mutable state (more general than the Reader monad).
  • Writer Monad - computations which may produce some sort of additional 'notification' kind of output in addition to the nominal result.
  • ...

From the above, it can be seen that each monad describes a sort of computation with particular characteristics. The monad abstraction facility is used to ensure that computations of a certain sort maintain a consistent handling of these characteristics across a program in a way that is verifiable by the type checker. Thus, monads provide a way to encode the behavior of computations themselves in the type system of the language.

As a side note, it makes even more sense to look at monads this way when you consider them as a special case of arrows, which are an even more flexible way of specifying the characteristics of a computational style in the type system.

When do you use monads? Well, it depends. Some languages only expose certain functionality through a monadic interface, e.g., Haskell requires you to use the IO Monad to perform IO. In many other cases where monads are used, they are often not strictly necessary, but using them allows for a very neat and consistent way of handling certain styles of computational abstraction. For example, one may perform computations which are context sensitive by constantly passing the context around to different functions as an explicit parameter. Such a case, however, can be represented much more elegantly as a type of monadic computation in which the monad abstraction (e.g., Reader) handles the contextual details of the computation. By using a monad, a simplified interface to the necessary functionality can be provided, while the hard work of maintaining and passing the context is handled behind the scenes.
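As a sketch of that last point, here is a minimal hand-rolled Reader monad (the real Control.Monad.Reader is more general; the Functor and Applicative instances are required by modern GHC):

```haskell
-- A minimal Reader monad: a computation that can consult a shared
-- environment without it being passed as an explicit parameter.
newtype Reader env a = Reader { runReader :: env -> a }

instance Functor (Reader env) where
    fmap f (Reader r) = Reader (f . r)

instance Applicative (Reader env) where
    pure x = Reader (const x)
    Reader f <*> Reader x = Reader (\e -> f e (x e))

instance Monad (Reader env) where
    Reader r >>= f = Reader (\e -> runReader (f (r e)) e)

-- ask retrieves the environment "behind the scenes".
ask :: Reader env env
ask = Reader id

greeting :: Reader String String
greeting = do
    name <- ask
    return ("hello, " ++ name)
```

Running runReader greeting "world" supplies the context once, at the outside, instead of threading it through every function by hand.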

Thanks, that is the best

Thanks, that is the best explanation I've gotten so far. Most of the answers seemed to assume that the reader understands the basics of Haskell or has a degree in advanced math. Some follow-up questions…

Without these monads, is there a way to control the sequence in which calculations are performed? Or is that strictly decided by the compiler/interpreter?

Well, yes... but

Without these monads, is there a way to control the sequence in which calculations are performed? Or is that strictly decided by the compiler/interpreter?

In a lazy functional language, the evaluation order of functions is deliberately left unspecified. That said, there are various things you can do within the framework of function application to ensure that function evaluations are ordered, by making the result of one function depend on the value returned by a second function (which must then be evaluated before the first function can be evaluated). Ultimately though, this can get pretty hairy, and what you end up doing is pretty much exactly what a monad does, except in an ad hoc fashion.
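A sketch of that ad hoc version: thread a value through by hand, so each call consumes the previous call's result and the evaluation order is pinned down by the data dependency (the names tick and twoSteps are invented for illustration):

```haskell
-- Forcing an order via data dependency: each call takes the current
-- counter and returns the updated one, so the second tick cannot be
-- evaluated before the first has produced s1.
tick :: Int -> (String, Int)
tick n = ("step " ++ show n, n + 1)

twoSteps :: Int -> ((String, String), Int)
twoSteps s0 =
    let (a, s1) = tick s0
        (b, s2) = tick s1
    in ((a, b), s2)
```

This hand-threading of state is exactly the plumbing that the State monad packages up behind bind.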

Monads provide a generalized, structured way to approach this problem. Understand, monads don't add anything new to an FP language, in terms of altering the semantics of the language. They're just a (to steal a term from another recent story) rather convenient design pattern. With the added advantage that the pattern in question obeys certain useful algebraic laws that make it easy to compose with other monads in interesting ways. Languages like Haskell have provided some convenient syntactic sugar to make this design pattern even easier to use. The resulting code ends up looking suspiciously like imperative code, but is really functional "under the hood".

Disclaimer: The preceding is, again, based on my own rather limited understanding of monads and their use in FP languages.

Links & quotes

For some coverage of monads in general, see the previous thread Understanding Monads. There's also this Collection of links to monad implementations in various languages. Finally, the new thread Everything Your Professor Failed to Tell You About Functional Programming links to an article which includes an explanation of monads in terms of Java/C#-style interfaces.

As the last link above points out, monads can be seen as a kind of design pattern which is useful in the context of functional programming (although not all design patterns have such precisely defined properties). Quoting from the abstract of the first paper above, monads "provide a convenient framework for simulating effects found in other languages, such as global state, exception handling, output, or non-determinism."

A monad

is just a strategy for combining computations, so consequently there are a lot of uses, all of them needing some kind of structuring of a computation.

But since you are probably asking this from a Haskell perspective, you should just use it all the time :) There's no real way around that in Haskell (at least if you want to use the libraries), and in Haskell it is presumed that monads are usable for almost anything...

You could encode a class to act like a monad; heck, even a record with a consistently used set of accessor functions is a faithful implementation of a Monad, just as long as they obey the following laws.

* If i have an apple (a), i can get a box of apples (m a)
* If you will give me a banana for each apple (a -> b), and I have a box of apples (m a), then I can get a box of bananas (m b).
* If I have a box of boxes of apples (m (m a)) then I can take the apples from each, and put them in a new box (m a).

a, b of course stand for basic types of the language
(a -> b) is a function from a's to b's (e.g., f(x) = x*x)
m is the monad (container) (the record/class, if you will)
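In Haskell notation, the three "box" operations above are the standard return, fmap, and join; here is a sketch with Maybe as the box (the names wrap, mapBox, and flatten are invented for illustration):

```haskell
-- An apple into a box: the standard name is 'return' (or 'pure').
wrap :: a -> Maybe a
wrap = Just

-- A banana for each apple, applied inside the box: standard 'fmap'.
mapBox :: (a -> b) -> Maybe a -> Maybe b
mapBox _ Nothing  = Nothing
mapBox f (Just x) = Just (f x)

-- Flatten a box of boxes into one box: standard 'join'.
flatten :: Maybe (Maybe a) -> Maybe a
flatten Nothing  = Nothing
flatten (Just m) = m
```

In Haskell's Monad class, (>>=) plays the role of mapBox followed by flatten; either pair of operations can be defined in terms of the other.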

Those aren't really laws

That's just the interface. It's important that the box of apples you get from return is a fresh box, and that the new box you get from join (aka mult) is somehow the combination of the boxes you had before.

The monad laws are:

  • if you have a box of apples (m a) and get a fresh box of boxes of apples (m (m a)) and then combine all the boxes back into a box of apples, you get (the equivalent of) your original box of apples
  • if you have a box of apples (m a) and, since I will give you a fresh box of apples for each apple (a -> m a), you get from me a box of fresh boxes of apples, and then combine all the boxes back into a box of apples, you get your original box of apples
  • if you
    1. have a box of boxes of boxes of apples (m (m (m a))), and you take all the boxes of apples from each, combine all the now empty boxes into a new box and put the boxes of apples back in to get a box of boxes of apples, and then take out all the apples, combine all the boxes, and put the apples in, you get a box of apples which is exactly the same as if you
    2. have a box of boxes of boxes of apples (m (m (m a))), and since I will give you a box of apples for each box of boxes of apples (m (m a) -> m a), you get from me a box of boxes of apples, and then take all the apples and combine all the boxes to get a box of apples
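In Haskell terms the three laws above read join . return = id, join . fmap return = id, and join . join = join . fmap join; a sketch stating them as checkable properties for Maybe:

```haskell
import Control.Monad (join)

-- The monad laws as checkable properties, instantiated at Maybe.
lawLeft, lawRight :: Eq a => Maybe a -> Bool
lawLeft  m = join (return m)      == m  -- box the box, then flatten
lawRight m = join (fmap return m) == m  -- box each apple, then flatten

lawAssoc :: Eq a => Maybe (Maybe (Maybe a)) -> Bool
lawAssoc m = join (fmap join m) == join (join m)
```

These properties hold for every value, not just the samples one happens to test; the laws are equations between programs.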

What's the tune?

I think that explanation would be really quite seriously terrific if it were animated, like the classic Project Mathematics stuff. Anybody reading this any good with Flash?

Data types and computation types

One reason that monads confuse is that people expect to find a single thread running through all of the examples, but in fact there are two quite different sorts of motivation at play.

The first class of examples are ones that use categories essentially as abstract datatypes, where morphisms stand as constructors and projection functions: essentially trivial computationally. Here, the right intuition to have is that monads represent container types, with the monad most naturally represented using natural transformations, the unit function giving a singleton collection, and the concat function flattening a collection of collections into a single collection. Monadic parser combinators are a pretty good way of showcasing this style of monad.

The second class of examples depends upon a more sophisticated intuition for modelling computation categories, using Moggi's computational lambda calculus. While the underlying category we use still represents data values, we are interested in monads representing computations that return those values. Here the Kleisli category of maps from values to computations is what we are most interested in, and we change to a different representation of monads, using return and bind syntax, that talks about the Kleisli category. The examples Darin Morrison gave from the Haskell library are in this mold; also nice to consider are the continuation monad, and the idea behind the Harris-Marlow-Peyton-Jones-Herlihy paper Composable memory transactions.

What makes it tricky not to distinguish between these classes of monad is that applying intuitions from one to the other risks confusion, because the relationship of morphisms to programs is different in each case. In the first case, all of the morphisms you are interested in are exactly represented by programs; in the second case, the morphisms in the Kleisli category represent "what is going on at runtime" and so are not possible for the programmer to represent beforehand.

Just in case I haven't lost most people, I'll get a bit mathematical here. The two classes of monad also tend to draw on different mathematical aspects of monads. Categories have two kinds of structure: a partial-order structure, which tells you which objects are connected to which other objects, and an algebraic structure, which tells you how many morphisms each object has to itself. Similarly, there is an algebraic class of monad linked to the algebraic structure of categories: one of the canonical mathematical examples of a monad is the functor taking each set to the free monoid over that set; this structure corresponds to the first kind of example. The second kind arises over partial orders when we notice that closure operations are monads: it's a bit more tricky to express what is going on here, but they correspond to the second kind of monad.

Thank you

While I've read several things that say "monads are more than one thing" I think the extra note that it thus can easily lead to confusion helps it stay more clear in my mind.

Computation types as data types

The second class of examples can also fit in the first class. We may think of programming effectful computations with monads as some sort of meta-programming where the abstract data type represents a computation returning a value of some type.

To put it simply (hopefully), unit takes a value and returns a chunk of code that produces that value. The function bind takes two parameters, a code chunk and a code chunk generator (a function). The result (also a code chunk) is reminiscent of run-time code generation: the result of the first chunk (when it is run) will steer the generation of the remaining code chunk by the second parameter.

In the case of the IO monad we build the pieces of code that do IO using the do notation, bind, unit and what have you. Then, we tell the IO interpreter to run our produced IO code by placing it in main.

No, not really

The first class of examples treats universal constructions that don't leave anything unsaid. It's quite true that it's possible to implement, say, the IO monad by a type that takes a stream of input events to the output value and a stream of output events, but this type will not be a universal construction: it just doesn't live in the same universe as the first example. When one realises the IO monad, the code will be linked at run time to OS services that provide and execute these event streams, binding, if you like, a free variable. There is no such binding to be done in the first case; everything is fixed by the semantics of the core of the language.

A very simple definition

Here's a definition of "monad" that's been bouncing around in my head
for a while:

"A monad is an interpreter mod."

I mean 'mod' in the sense of 'modification', like for a car or a game.
And by 'interpreter', I mean the conceptual central engine that moves
computation forward, even when we're talking about a compiled
language.

A monad lets you customize the way your program is run.

This definition is a *practical* definition; it ignores the precise
definition of a monad, both in pure math and in its Haskell
implementation.

Some friends here at work, who love Scheme, have been asking me what a
monad is, and they understand what it means to, for example, thread
some state through a computation. But as for other monads, they
aren't sure what they are really for. Even when you explain what they
can do, you haven't really provided a motivation for the concept.

So, at the most basic level, what does a monad do? It lets you
customize the implementation of the language to impose an ordering, or
thread state, or to thread several pieces of state, or to back up and
re-try changes to state (composable memory transactions), etc.

Of course, it doesn't *really* change the implementation, but when you
create a monad, it can feel like you are doing so.

For example, let's say you wanted to run some code, but you also want
to count the number of times you use '+'. Well, a Scheme person might
first think of writing a meta-circular interpreter, and modifying it
slightly to do the counting.

If you then said that you might want to add other customizations, then
they might say that they would write a meta-circular interpreter with
places for hooks added, and you could use the hooks to install a bunch
of different customizations.
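The counting example can be sketched in Haskell with a small hand-rolled monad, essentially a Writer over the integers under addition, rather than a modified interpreter (all names here are invented for illustration):

```haskell
-- A tiny counting monad: a value paired with how many instrumented
-- additions were performed in computing it.
newtype Counted a = Counted { runCounted :: (a, Int) }

instance Functor Counted where
    fmap f (Counted (x, n)) = Counted (f x, n)

instance Applicative Counted where
    pure x = Counted (x, 0)
    Counted (f, m) <*> Counted (x, n) = Counted (f x, m + n)

instance Monad Counted where
    Counted (x, m) >>= f =
        let Counted (y, n) = f x in Counted (y, m + n)

-- The instrumented '+': behaves like (+) but bumps the counter.
plus :: Int -> Int -> Counted Int
plus a b = Counted (a + b, 1)

example :: Counted Int
example = do
    x <- plus 1 2
    y <- plus x 3
    return y
```

runCounted example yields the sum together with the count of additions; the "interpreter mod" lives entirely in the bind, which splices the counts together behind the scenes.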

Conceptually (but not actually), a scheme program is run by an
interpreter that takes (f a b c) and applies f to a, b, and c.
There's other stuff (evaluating a, b, c; scoping issues; macros), but
this action of applying f is the main event. It's the first thing you
need to understand when you're learning something like Scheme for the
first time. A function applied to some arguments.

So, if I said, "hey, I have a cool implementation of Scheme where you
can run hooks before and after the application of f", then you might
think that that's pretty powerful, and could be used for lots of
different purposes.

A Haskell monad serves some of these purposes, and it does it
brilliantly and cleanly, using the Monad type class. It has a precise
definition that helps you plan your "interpreter mod" carefully, and
enforces safety and coherency. It's not the same as a custom
interpreter; it can do things that don't fall under the category of
"interpreter mod"; it isn't quite as straightfoward as just swapping
in a new interpreter -- you have to create types that fit the code
that implements the mod, and you have to go in and out of those types
properly; it's definitely not trivial to combine two monads, in
general.

So that's my definition. It doesn't really answer the question "what
is a monad?", but rather answers the question "what is a monad for?",
or "why the hell do I have to learn this tricky thing?"

This definition is like defining "continuation" as "you can freeze the
program and resume it later on". It's really imprecise; it doesn't
say what the program does after the freeze; there are still mysteries
like "what is this closure I pass to call/cc?" and "what does call/cc
return?"; it doesn't say anything about scoping, or state, or
resumption. But it definitely gets at the essence.

And I think that having a very basic definition of monad like
"interpreter mod" proves this first step, and this is useful because,
as we have seen, people keep creating new threads called "what is a
monad?" and "why do I want to know what a monad is?" and so on.

So, a monad is an interpreter mod.

Greg

For example, say you wanted

For example, say you wanted to add 3 potential numbers, all of which were embedded in the 'option' type. Without a monad, you'd have to check each potential number before adding it to the others, to see whether it was an actual number first; if it wasn't, you'd set the result of the entire addition expression to 'none'. In the Maybe monad, you don't have to do this. You just specify the entire computation up front

i never knew that! things make a lot more sense now. does someone have some example code?

thanks!

I know this one! (Maybe :-)

There's most likely a more elegant way of expressing this, but the following works:

maybeAdd3 :: (Monad m, Num a) => m a -> m a -> m a -> m a
maybeAdd3 ma mb mc = do
    a <- ma
    b <- mb
    c <- mc
    return (a+b+c)

And a test in GHCI:

*Main> maybeAdd3 (Just 1) (Just 2) (Just 3)
Just 6
*Main> maybeAdd3 (Just 1) Nothing (Just 3)
Nothing

Without the do-sugar, it looks like this:

maybeAdd3 ma mb mc =
    ma >>= \a ->
    mb >>= \b ->
    mc >>= \c -> return (a+b+c)

There's probably some nifty combinator already that can just "lift" (+) or sum into the monad, but I can't find it (liftM doesn't seem to be it). There is also a msum function but that does something entirely different — it seems to just return the first non-Nothing argument. (Of course, that's perfectly reasonable given the more general meaning of "sum", but confused me for a bit when dealing with Maybe Integers).

Nifty combinators

There's probably some nifty combinator already that can just "lift" (+) or sum into the monad, but I can't find it (liftM doesn't seem to be it).

You can use liftM2 to lift binary functions, so liftM2 (+) :: (Monad m, Num a) => m a -> m a -> m a. For a list of computations, you could use foldr (liftM2 (+)) (return 0).

Alternately, you can use sequence, which turns a list of computations into a single computation producing a list of results, and then use liftM to lift sum into the monad, like so: liftM sum . sequence. This is potentially less efficient, because it traverses the list twice.

There is also a msum function but that does something entirely different — it seems to just return the first non-Nothing argument.

msum is to sum as mplus is to (+). If you think of sequence as "and" (it equals Nothing if any of the sub-computations equals Nothing), msum is more like "or" (it only equals Nothing when all the sub-computations equal Nothing).
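A small sketch contrasting the two behaviours on Maybe:

```haskell
import Control.Monad (msum)

-- msum behaves like "or": the first Just wins.
firstHit :: Maybe Int
firstHit = msum [Nothing, Just 1, Just 2]

-- sequence behaves like "and": any Nothing sinks the whole list.
allHits :: Maybe [Int]
allHits = sequence [Just 1, Nothing, Just 2]
```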

liftM2

You can use liftM2 to lift binary functions...

Ah yes. Not sure how I missed that...

It's not just for Maybes

It's worth adding that you've written maybeAdd3 in such a way that it works with any Monad. So, for example, we get

*Main> maybeAdd3 [1,-1] [2,4] [3,5]
[6,8,8,10,4,6,6,8]

which is the list of all ways to form sums of three elements, one from each list. And this is one reason why Monads are cool - they abstract over constructions like Maybe and List. Even though List and Maybe look different at first, the 'plumbing' required to form this outer product of lists is the same 'plumbing' required to bail out early from a computation with a value of Nothing.

thanks! you've no idea how

thanks! you've no idea how much confusion that just cleared up....

Maybe Example...

I don't know if this will be more enlightening or intimidating, but here's an example of the different trade-offs one could make.

import Control.Monad

main = do let vals = [(1,0,-2),(0,0,0),(1,1,1)]
          print $ map quad   vals
          print $ map quad'  vals
          print $ map quad'' vals

quad  (a,b,c) = (root1,root2)
  where disc  = b*b - 4*a*c
        root1 = (-b + sqrt disc)/(2 * a)
        root2 = (-b - sqrt disc)/(2 * a)
        
quad' (a,b,c) = liftM2 (,) root1 root2
  where disc  = Just $ b*b - 4*a*c
        root1 = (-return b + (sqrt disc))/(2 * return a)
        root2 = (-return b - (sqrt disc))/(2 * return a)

quad'' (a,b,c) = if a==0 
                  then Nothing
                  else if disc<0 
                        then Nothing
                        else Just (root1,root2)
  where disc  = b*b - 4*a*c
        root1 = (-b + sqrt disc)/(2 * a)
        root2 = (-b - sqrt disc)/(2 * a)

instance Num a => Num (Maybe a) where
    (+) = liftM2 (+)
    (-) = liftM2 (-)
    (*) = liftM2 (*)
    fromInteger = Just . fromIntegral
    
instance Fractional a => Fractional (Maybe a) where
    (/) a (Just 0) = Nothing
    (/) a  b       = liftM2 (/) a b
    
instance (Floating a, Ord a) => Floating (Maybe a) where
    sqrt Nothing   = Nothing
    sqrt (Just x)  = if x < 0 then Nothing else Just (sqrt x)
    -- (the remaining Floating methods are omitted in this sketch)