## Lightweight Monadic Programming in ML

Many useful programming constructions can be expressed as monads. Examples include probabilistic modeling, functional reactive programming, parsing, and information flow tracking, not to mention effectful functionality like state and I/O. In this paper, we present a type-based rewriting algorithm to make programming with arbitrary monads as easy as using ML's built-in support for state and I/O. Developers write programs using monadic values of type M t as if they were of type t, and our algorithm inserts the necessary binds, units, and monad-to-monad morphisms so that the program type checks. Our algorithm, based on Jones' qualified types, produces principal types. But principal types are sometimes problematic: the program's semantics could depend on the choice of instantiation when more than one instantiation is valid. In such situations we are able to simplify the types to remove any ambiguity but without adversely affecting typability; thus we can accept strictly more programs. Moreover, we have proved that this simplification is efficient (linear in the number of constraints) and coherent: while our algorithm induces a particular rewriting, all related rewritings will have the same semantics. We have implemented our approach for a core functional language and applied it successfully to simple examples from the domains listed above, which are used as illustrations throughout the paper.

This is an intriguing paper, with an implementation in about 2,000 lines of OCaml. I'm especially interested in its application to probabilistic computing, yielding a result related to Kiselyov and Shan's Hansei effort, but without requiring delimited continuations (not that there's anything wrong with delimited continuations). On a theoretical level, it's nice to see such a compelling example of what can be done once types are freed from the shackle of "describing how bits are laid out in memory" (another such compelling example, IMHO, is type-directed partial evaluation, but that's literally another story).

## Comment viewing options

See also the mention of the paper by its author recently on LtU. I agree this deserves a full story in itself.

### Thanks!

Somehow, even though I'd read that thread, I totally overlooked the link to this paper. Thanks for calling my attention (back) to it!

### A bit confused...

What bothers me the most is that the authors don't mention Wadler's Marriage of Monads and Effects at all. In that paper, Wadler established a translation from a computational lambda calculus for a language with state into a monadic language, but the application of monad morphisms was implicit (those were made explicit by Filinski's M^3L, which is mentioned by the author). So comparing exactly what they achieved over Wadler's work is what is mostly missing for me.

Of course, from a theoretical perspective, we should ask whether there is a principled way to get these monads and monad morphisms... Especially considering that their number grows in size exponentially to the number of effects used. :)

### Here is one way to

Here is one way to automatically derive morphisms.

Analogous to CPS transformation we can do something I will call monad transformation. It transforms a piece of code with implicit effects to insert bind/return calls as appropriate.

For example the map function:

map f [] = []
map f (x:xs) = f x : map f xs

Becomes:

map' f [] = return []
map' f (x:xs) = do x' <- f x
xs' <- map' f xs
return (x' : xs')

If we monadify it with respect to the result of the f argument. That is, the type changes from (a -> b) -> [a] -> [b] to (a -> m b) -> [a] -> m [b].

So the rules are that you insert binds for all nested function calls, and you insert a return at the end. A more precise way to say it is you transform the expression to ANF, then you replace all lets by monadic binds and wrap the end result in a return. This way you can mechanically monadify any function. The map' function is exactly the monadic map mapM in Haskell. Applying the same transformation to filter gives you filterM. It works for any function. In particular it also works for the monadic return and bind functions. For example take the bind function from the maybe monad:

x >>= f = case x of
Nothing -> Nothing
Just v -> f v

Monadifying it with respect to the x argument we get:

x >>= f = do x' <- x
case x' of
Nothing -> return Nothing
Just v -> f v

This is the maybe monad transformer, minus the MaybeT wrapper boilerplate.

Lets try the same with the list monad:

xs >>= f = concat (map f xs)

We get:

xs >>= f = do xs' <- xs
tmp <- map' f xs'
return concat tmp

And we got the list monad transformer.

The reason this works is that when you want to execute code with monadic effects in a monad A, what you want is to monad transform the code wrt A. If you then also want to have effects from monad B, you can transform the code resulting from the previous transformation again wrt B. If you work out the details of what twice monad transforming does, you will see that everywhere returnA x appeared with a single monad transformation, now returnB' x appears, where returnB' is the return function of monad B transformed wrt monad A. The same thing happens for bind. So this gives you a way to mechanically construct the return and bind functions for the combined monad: returnAB = returnB' and bindAB = bindB'.

I don't know if this method is explicitly described elsewhere. It probably is, given that this seems fairly obvious and the only way that could possibly work, but I don't know the name of it so I couldn't find it. Anyone?

### Kleisli Arrows

if we modeled normal functions as Kleisli arrows in an abstract monad, I believe we would achieve similar results. Plain old functions would then be run in the 'Identity' monad.

### That is exactly what I'm

That is exactly what I'm doing; treating returnB and bindB as Kleisli arrows in the A monad. That doesn't really tell you *how* exactly you're viewing them as Kleisli arrows, so this summary doesn't seem sufficient.

It doesn't matter which monad you run normal functions in; the monad laws imply that functions that don't use effects run the same in any monad, though the identity monad would be a natural choice. You can see that I used this to cheat a little in the map example: if you want to be consistent you have to transform the cons function : too, but because it is a pure function and cannot possibly produce an effect, you can just call it and use its result directly. This is just an optimization; you could also monadify all functions, then the map code would look like this:

map' f [] = return []
map' f (x:xs) = do x' <- f x
xs' <- map' f xs
x' :' xs'

The return would then happen inside :', which then has type Monad m => a -> [a] -> m [a]. The same goes for the concat function in the list monad bind. If you apply the transformation consistently all types a -> b become Monad m => a -> m b.

One thing that this model supports and might be interesting is return and bind functions that themselves use effects.

Another thing to think about that I've wondered out loud about here before is what the corresponding transformation for comonads looks like (all types a -> b would become c a -> b).

This monad transformation is related with and can be used to implement direct style monads like in the Representing Monads, just like CPS transformation is related with and can be used to implement call/cc.

### This is just the CBV translation into the Monadic Metalanguage

This kind of call-by-value translation appeared in, for example, Monads and Effects by Benton, Moggi and Hughs (see Figure 4 in "Metalanguages for Denotational Semantics").

This reference is definitely not the first place it appears, for example Wadler mentiones it as "just the usual typed call-by-value monad translation".

You seem to be applying it to a Haskell program, yielding another Haskell program, rather than translating a CBV language into a monadic language. Can you please explain what you achieve by it?

(Both Wadler in his marriage paper and Hicks et al. in this paper used translations from a CBV language into a monadic language, and not within Haskell.)

### What I achieve by it is

What I achieve by it is deriving a monad transformer from a monad. Link to the paper you mentioned that doesn't require you to login.

### Monad transformers: complexities and limitations

Jules Jacobs wrote:

Lets try the same with the list monad:

xs >>= f = concat (map f xs)

We get:

xs >>= f = do xs' <- xs
tmp <- map' f xs'
return concat tmp

And we got the list monad transformer.

Let us try it out:
bind xs f = do xs' <- xs
tmp <- map' f xs'
return (concat tmp)

f1 x = do
print x
return [x]

t1 = (return [1,2] bind f1) bind f1
t2 = return [1,2] bind (\x -> f1 x bind f1)
Whereas t1 prints "1 2 1 2", t2 prints "1 1 2 2". Thus bind is not associative and can't be the bind of a monad. The giveaway was the phrase about list monad transformer': there is no such thing.

I should emphasize that the monad transformer approach is inherently limiting because the layering of transformers is statically determined. In general, the composition of monad transformers is not symmetric: the order matters. There are realistic programs where no fixed order suffices because at different moments the program needs different orders. The order must be dynamic, which monad transformers don't support. The paper Delimited Dynamic Binding', Sec 4.3, has described several examples of realistic programs that require the dynamic ordering (the accompanying code archive contains the complete code). http://okmij.org/ftp/Computation/dynamic-binding.html

### I'm sure you have a much

I'm sure you have a much better understanding than me about these things, but here he goes for fun :)

I agree that this was a bad example, and the fact that this transformation still produces something that the Haskell community has produced is largely accidental. Still, the transformation is valid in a way. Say you were given a program in direct style that uses list monad operations and IO monad operations. There are two ways we could execute this: monadify it wrt the list monad and then monadify it wrt the IO monad. Or we could monadify it once wrt the derived list "monad" transformer applied to the IO monad. These two give the same result, and in both cases you indeed get unintuitive behavior of the nested let example you gave. So if you don't have monad laws the transformation cannot give you these, but it still gives you something reasonable.

In symbols:

T m p transforms program p wrt monad m
Q m1 m2 is the result of the derivation procedure I sketched applied to monad m1 wrt m2

Then the following holds:

T m2 (T m1 p) == T (Q m1 m2) p

Dynamic layering is indeed an interesting problem and I don't see how monad transformers could support that either. What you want is essentially squeezing a continuation monad between a reader monad, but then the two halves of the reader cannot communicate with each other. Monad transformation puts one monad next to another, perhaps there is a similar operation that puts a monad inside another, allowing it to decide dynamically which monad to put on top. Or maybe I am completely wrong, dynamic layering is something I know I don't understand (as opposed to the static layering above, which I think I understand but probably don't).

Do you know other examples than the reader monad that can be dynamically layered in an interesting way?

### Deriving morphisms

Here is one way to automatically derive morphisms.

I guess I should clarify what I meant by "a principled way to get these monads and monad morphisms".

Lets assume the effects we have are:

get, put, throw.

Then, for each of the 8 subsets:
{}, {get}, {put}, {throw}, {get, put}, {put, throw}, {get, throw}, {get, put, throw}
Id, Environment, Writer (with suitable monoid), Error, State, ErrorT (Writer), ErrorT (Environment), ErrorT(State)
respectively.

This choice is obvious in this case, but it isn't obvious in general: if I have a monad that supports the effects: Eff1, Eff2, ..., Eff n, what is the monad corresponding to an arbitrary subset of these effects: {Eff i1, ..., Eff ik}?

This question was posed by Wadler in the conclusions section of his Marriage of Monads and Effects. Various people (for instance, Tolmach in Optimizing ML Using a Hierarchy of Monadic Types, read especially section 6) worked out the monads corresponding to some subsets for a particular choice of effects. But I don't know of any published general principle to deduce these monads.

(I hope it's needless to say that the number of these monads grows exponentially as the overall set of effects grows.)

Hicks et al.'s paper assumes such a hierarchy is given (although not necessarily completely). They don't try to derive it. The situation can be rectified a bit, if we reuse the underlying semantic idea of Oliveira and Schrijvers's Monad Zipper: use monad transformers to describe the 'biggest' monad, and then mask the effects that are not used. But even this solution doesn't really give a principled account of how to get, say, the Writer monad from the State monad.

### That seems to be the

That seems to be the question I answered here: how to compose arbitrary monadic effects. I don't think you can hope to automatically get a state monad from a reader and writer monad, because these two operations need to pass data from one to another. That is, conceptually reader + writer is not the same as state. Reader + writer is the ability to read a value from place 1, and the ability to write a value to place 2. But you can do the opposite and get a reader and writer monad out of a state monad by hiding the put or get operation.

The situation is even worse than exponential, because order matters. We can even use the same monad multiple times. For example if we have StateT . ContT . StateT then we have two kinds of state: one kind that gets reset when we invoke a continuation and one kind that remains across continuation invocations. So there are not only exponentially many possibilities, there are even infinitely many.

A general theory of e ects and monads. As hypothesised by Moggi and as born out by practice, most computational e ects can be viewed as a monad. Does this provides the possibility to formulate a general theory of e ects and monads, avoiding the need to create a new effect system for each new effect

Is a bit different. He also wants an effect system to go along with the effect. If the monad operations are well typed, this also gives you that, in some sense. The effect system will be as granular as the type system can check the monadic operations. For example in Haskell I don't think you can specify a ST operation that only modifies certain references, so in the resulting effect system you won't be able to do that either (no magic). If you have an exception monad with multiple exception types then the Haskell type system cannot statically distinguish functions that only raise a subset of the possible exceptions, so the resulting effect system won't be able to do that either.

The papers Marriage of Moands and Effects and Optimizing ML Using Hierarchy of Monadic Types look interesting. I will try to read them.

### The opposite question

That seems to be the question I answered here: how to compose arbitrary monadic effects. I don't think you can hope to automatically get a state monad from a reader and writer monad [..]

Regarding the question Wadler asks: he poses two questions. The first one is a conjecture that the denotational semantics might take the form of T X = S1 -> X x S2, where S1 are the store of locations you only read from, and S2 is the store of locations you only write to. Note that this doesn't give you a monad: if you only ever write to a single location, T X = X x S isn't a monad (what's the unit?). But T X = X x (S + 1) is. The additional right injection says 'no state change has occurred'.

The second question is the one you quoted: can we have the same spiel happen in general? This is the underlying tone of my question --- find a principled, general account for the previous question.

Hicks et al.'s work can partially address the second question, and Oliveira and Schrijvers's can be used to partially address the first question.

Finally, what you mentioned is exactly the advantage of subtyping exceptions in Java and for region annotations (as mentioned in Wadler, for example). You have some granularity that allows you to distinguish between different types of exceptions, but rough enough to keep things decidable.

### What is the problem with

What is the problem with just using the state monad and hiding the write operation to get a reader -- or in general, just hiding the operations you don't want?

### I don't follow

State has 'get' and 'put'. If you have an algorithm that just invokes 'get', what's the problem?

### Composition and Decomposition

The issue, which I wrongly assumed was clear in context of earlier discussion, is how we achieve composition and decomposition of effects - i.e. with a few generic composition operations rather than specific code for each of 2^N or even N! integrations.

Even if the reader monad algorithm invokes 'get' (instead of 'ask' or something else), you still must lift your reader monad to run in the state-monad's environment (i.e. to propagate state, unchanged, across each operation). Can you think of any generic way to make this work?

### Processes and subtyping

You can read my post below, but my preferred solution would be to have Processes that can signal 'get' as a subtype (I'll leave my exact meaning of this term here ambiguous) of Processes that can signal 'get' or 'put'.

I think there are definitely ways to "make it work" in Haskell. It's just a question of how much boilerplate code is involved and how elaborate of a contraption you are willing to build.

### Capabilities and subsetting

Processes are not difficult to model in Haskell. Adapting composable sum types is more an issue, but it might be achievable with some sort of type-family programming.

I prefer to control most effects with capabilities - i.e. two capabilities - one each to signal 'get' and 'put' - is literally a superset of the capability to just signal 'get'. Composition and decomposition of effects is straightforward. Capabilities have the advantage of being orthogonal to the type system, while still allowing types to be effectively leveraged for proofs and safety. It also works well in open systems, and with concurrency.

Modeling capabilities in Haskell is perhaps best achieved with arrows, though Haskell's Control.Arrow models is too general for my work. I've been developing a new model of arrows to accommodate asynchronous behaviors.

### Signals as Capabilities

Modeling Processes wasn't the problem - it was embedding them in such a way that they're easy to work with to capture the flavor of programming I wanted. I'm not particularly worried about the theoretical model of Processes as it's pretty trivial.

Arrows tagging capabilities is pretty similar to the way I use signals, I think. Composition and decomposition of signals probably follows the same pattern you're using with capabilities.

Processes are much closer to the monadic approach, though. All effects occur in a linear order, which means you can suspend a Process in response to any given signal and it's in a well defined state. This makes them useful as a replacement for delimited continuations.

That said, I do see the utility of identifying Processes as subset of a more general model of computation that has multiple threads of execution. One approach I'm considering is to have such more general computations serve as implementations for Processes that just specify arbitrary interleave semantics. That way you can have Processes that can in principle (semantically) be suspended, but that support parallel implementations when possible.

### One more point

A big difference between signals and capabilities that I forgot to mention is that signals are always parameters that must be hooked up by the caller. It's not possible for a process to squirrel away a signal for later use. Similarly, revocation of signals (or swapping them out with a mock handler) is always possible and doesn't require usig a special pattern at construction time.

Monads are not very good for modeling and composing effects, in the general case.

Monads capture a lot of effects by default: ordering/time, identity, state, commitment, and unbounded time/space resource consumption. These effects interfere with modeling a single effect, but sometimes we can work around that. I posit that they strongly interfere when composing multiple effects.

I.e. we do not compose monadic effects as a set of {get,put,throw} capabilities in parallel. We compose in layers - N! possible permutations for N effects. Turns out this is also a problem for decomposition.

Can we find a better foundation for modeling effects? Something, perhaps, that exhibits:

• associativity - We can regroup expression of effects. We must explicitly model space as an effect, if grouping should matter.
• commutativity - We can rearrange expression of effects. We must explicitly model time as an effect, if relative order should be relevant.
• idempotence - We can eliminate duplicate effects. We must explicitly model identity as an effect, if replication should be meaningful.
• monotonicity - effects only add to a system. We must explicitly model time, identity, and framing to model any destructive updates.
• predicative time - We can control time resources. We must explicitly model incremental computation, if we need divergence or arbitrary time as an effect.
• predicative space - We can control memory resources. We must explicitly model incremental accumulation, when we need divergence or arbitrary space as an effect.
• determinism - we must explicitly model choice or probability as an external effect, when we need it.
• uncommitted - We can change the inputs and the effects will follow, similar to dataflow programming. We must explicitly model commitment as an effect, when we need it.
• stateless - we must explicitly model state as an external effect, if we need it. (Keeping this orthogonal to commitment is interesting.)

I believe my RDP model comes very close. I expect to manage most effects as capabilities. But I would like to see what I can do with RDP behavior transformers (e.g. to model dynamic scope) at some point.

### Composing effects

I suspect that effects strongly interfere when composing multiple effects.

It's an interesting and difficult problem though. It may be true that monads interfere in composing effects more than what is inherently unavoidable. Some have suggested that Lawvere Theories are a better option.

### Forget monads - Processes are nicer

I've mentioned a few times around here that I think an abstraction I call Processes might be more useful than monads & transformers, and this thread has motivated me to make the case in a little more detail. Feedback, particularly perceived pain points, is solicited. The basic idea is that whenever we have an effectful function, we should keep the effects parametric until the last possible moment when we hook them up / run them, and never worry about what monadic values implement the effects (as we would with a transformer stack).

As an example, here's how I would write Oleg's example. Rather than working in the list monad, we work in a Process monad that has a signal 'branch', that from the point of view of the Process returns one of the elements of a list:

f1 x = do
print x
return x

p1 = do
x <- branch [1, 2]
y <- f1 x
z <- f1 y
return z

We can provide monadic definitions that make the above valid Haskell (where the monad is a continuation type like (a ->Process) -> Process), but I prefer more sugar - picking an evaluation order and unifying effectful with non-effectful functions:

p1 = f1( f1 (branch [1, 2])) -- same as above, with sugar

This is essentially just arguing for effect typing over explicit monadic plumbing, but with support for hooking up effects to handlers. Further, it's important that the handler has access to the suspended Process signaling the effect. As I noted in the recent delimited continuations thread, this gives a mechanism that's trivially convertible to delimited continuations, but with a better (IMHO) API. But I'm heading off course here...

Getting back to the example, the point is that there is no 'p2' because the difference between Oleg's 't1' and 't2' isn't encoded in my 'p1'. The code that runs 'p1' gets to decide whether to run the continuation processes to completion (another signal) with each branch value or whether to single step through the existing possibilities in a round robin fashion (suspending when certain signals are raised). Of course in Oleg's example, interleave order was encoded in bind grouping, which is obviously undesirable and violates the monad laws. In the Process approach it's pretty clear what the implementation possibilities are -- we just need to use some other signal(s) to know when to switch between possibilities.

The last time I advanced this argument, Philippa Cowderoy noted that we can do everything in abstract monads (e.g. Branching m, IO m => m String) and achieve something very similar to what I'm advocating, and I think that's true if we don't allow first class signals. Certainly, given such an abstract value it's always possible to define instances for a Process monad and get a Process (continuation) out of it. The defining characteristic of the approach I'm advocating is that we do not provide an interpretation of the effects via type class substitution, but rather only ever handle effects via running the Process.

If we do provide first class signals, then we can't necessarily convert from a Process into an abstract monad, and I think it's a good idea to have them. For one, we immediately address Oleg's objection about the need for dynamic rewiring of effects (as it would apply to Processes). Another issue is that Haskell doesn't easily let us have multiple constraints of the same type class, but imagine that we have a class File that defines operations for interacting with a single file. Then we might want to have a type like (File_a m, File_b m => m ()) that operates on a pair of files. The monad zipper paper addresses this problem, but what if we want to write a function that manipulates an unbounded number of files? In that case, in the usual monadic approach, we'd need to define a new type class for dealing with families of files. With first class signals, the approaches can be unified.

I'm still meaning to write up more of this, including more details on Process combinators and a type system for classifying them, but I'm still of the mind that Processes are probably a more natural and useful approach to effects than would be building a bunch of custom monads. I tried to do some example code in Haskell the last time this came up, but I found it pretty cumbersome to encode the types that I'd want. So my writeup of these ideas became bottlenecked on other language issues. I maybe should take off and write up the ideas more carefully with snippets in a non-existent language to generate some feedback.

### First Class Signals

Perhaps you could explain what you mean by 'first class signals'? I've spent some time reviewing, but it seems you did not discuss those in our e-mail conversation eight months ago.

I do encourage you to write more, in a public forum. After having tried it for three months, I think doing so benefits the writer as much as the audience. But LtU tends to bury ideas, and doesn't always result in the clarity you might pursue if writing for an anonymous audience. I suggest developing a dedicated blog or wiki for your project.

As an aside: p2 = f1 =<< f1 =<< branch [1,2]. We don't need syntactic sugar to get very near what you're desiring. Indeed, desugaring works nearly as well.

### First class signals

I think it's implicit in the description of Processes we discussed which was something like:

type Process = Signal -> (Process, Signal)

(This is a coarsely typed Process where you'd have to put all of the possible signals into a big algebraic datatype.)

My intention was to contrast this with the static type class approach.

I might set up a language blog or put out some notes. I agree it would be more helpful to ask LtU after I have something more substantial to link to. Thanks for the tip. I haven't seen your blog - can you provide a link?

### first class signals

Ah, you're just referring to passing a signal's value. I was confused, I suppose, because I distinguish between the notions of signal and the values carried by one.

The coarse-grained model we came up with earlier was:

type Process x y = x -> (y,Process x y)

You posited that we really need dependent types to win the most from this model.

### Reminds me of Eff

Does this idea relate at all to Bauer and Pretnar's Eff language?

It's a bit hard to relate the two without more precise descriptions, but at least Eff has an implementation that you can play around with.

### Quite similar

Thanks for the link. I didn't investigate Eff when that story was posted, but the algebras/effect substitution are definitely similar to the signal hook-ups I have with Processes. There are some differences. For a minor example, I favor imposing an evaluation order and letting effects mix freely with pure code, leaving your type system and IDE to mark purity. There are other differences, as well, but this is definitely a similar mechanism to what I'm advocating, and I believe it has most of the advantages that I posted about above. One thing I notice from the Eff docs is this comment:

Yes, eff currently does not check types. It does not seem easy to come up with a good type system for eff.

Whereas most of the time I've spent thinking about Processes has been related to typing them.

Any idea how Eff has been received? Are there reasons why an approach like this is considered inferior to Monads? I'm curious how this relates to Lawvere theories, but I'll apparently need to catch up on some category theory before I can figure it out. Can someone answer?

### A minimally category-theoretic intro to algebraic effects...

...can be found in the LICS 2010 paper, "A Generic Operational Metatheory for Generic Effects", by Patricia Johann, Alex Simpson, and Janis Voigtlaender. Patty & co. know an awful lot of category theory, of course, but it's rather decent of them to translate for those of us who think at slightly lower levels of abstraction. :)

What I liked about this work is that it made it clear to me how monads arise as a model of effects at all. Basically, the trick in this paper is to start with an operational semantics which does not model effects -- it just has uninterpreted term formers for effectful terms, so that evaluation builds a computation tree. Then, separately, you give an interpretation function for computation trees, at which point the connection to monads becomes "obvious" -- the effectful terms form a signature and the interpretation function can be seen as an algebra for that signature.

This intuition is almost certainly in Plotkin and Power, but I never worked it out clearly enough to believe it.

That connection is very old, and dates to the early days of algebras for a monad (circa '65). The algebraic view is particularly appealing:

• The unit of the monad u : A -> TA is the inclusion of variables in the set of (normalised) terms.

• The multiplication m : TÂ² A -> T A is the evaluation of a term of terms into a single (normalised) term.
• The strength str : A x TB -> T(A x B) shuffles variables around.

### Eff

There are some differences. For a minor example, I favor imposing an evaluation order and letting effects mix freely with pure code, leaving your type system and IDE to mark purity.

I don't see how that is different than Eff, which has an evaluation order and the finished type system would check purity.

Regarding the type system, it is still under development. However, there is an experimental base type system that you can play with using the --types flag. It is still not the full system the designers have in mind, though.

As to how it's received: it seems to effect interest, although I think a fully worked out type system and a cleanly presented operational semantics (which I saw Pretnar present in the European Workshop on Computational Effects) would make it a lot easier to buy in.

Are there reasons why an approach like this is considered inferior to Monads?

I don't think I ever heard anyone claim this approach is inferior. It is still quite new, so the major criticism is that it is not fully baked yet. But the designers never claimed it was.

The relation to Lawvere theories is inherent in the construction: Monads arose as a means to implement effects. Plotkin and Power's algebraic theory of effects decomposes the description into a pair: <S, E> of effect signature S and behaviour E given by effect equations. This shift in focus from an implementation of the effects into an interface to the effects (i.e., S) gives rise to a more modular account. The language Eff lets you manipulate these signatures using effect handlers.

### Differences

One of the Eff pages talks about using do notation for explicitly sequencing effects. The code examples don't seem to use this syntax, however, so you're right that this doesn't seem to be a difference.

This shift in focus from an implementation of the effects into an interface to the effects (i.e., S) gives rise to a more modular account.

I agree with this (and in fact I was trying to make this point in my 'forget monads' post). Probably the reason I don't see the connection to Lawvere theories yet is that I don't yet grok Lawvere theories (and thanks, Neel, for the link).

Are there any linkable documents describing the type system yet? My guess is that if it supports handlers whose bodies can themselves issue and handle effects, then the only differences between what I do and what Eff does are just presentation. Those still may be interesting differences in the design space, though, so I'll still try to write up my design sometime soon.