State and modularity
started 10/22/2003; 2:59:35 AM - last post 10/31/2003; 10:41:13 AM
Peter Van Roy - State and modularity
10/22/2003; 2:59:35 AM (reads: 11676, responses: 22)
(This was originally discussed on the Yahoo pragprog list; I think it's important
enough to present it on LtU as well.)
In our experience, true state (destructively assignable entities) is essential
for reasons of program modularity (which for this discussion I take as
meaning: the ability to change one part of a program in a significant way
without changing the rest of the program). Threaded state, e.g., monads
as used by Haskell or DCGs as used by Prolog, cannot substitute for it.
Here is a simple example to show the difference between true state
and threaded state.
Consider a system consisting of modules A, B, C, D, and E,
all written in a purely functional way.
Module A calls B, which calls C, which calls D, which calls E.
Now let's say I want to instrument E, to count how many times
it is performing some operation, and I want to observe this from A.
If I want to program this in a purely functional way (e.g., in the
pure subset of Haskell), then I am obliged to change the interfaces
of all the modules A, B, C, D, and E (*). If I have true state, then I can
get by with changes in just A and E. In other words, instead of
changing everything on the path, I can get by with changing
things on the ends. This is a major difference that has all kinds
of software engineering ramifications.
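To make the contrast concrete, here is a toy sketch of the purely functional version, with hypothetical names and each module collapsed to a single function: the counter must be threaded through every signature on the path, so A, B, C, D, and E all change.

```haskell
-- Toy sketch of the threaded-state version: the counter is woven
-- through every call on the path A -> B -> C -> D -> E, so every
-- signature on the path changes, not just the two ends.
type Counter = Int

opE :: Int -> Counter -> (Int, Counter)
opE x c = (x * 2, c + 1)          -- E's "work", plus one count

opD, opC, opB :: Int -> Counter -> (Int, Counter)
opD x c = opE (x + 1) c
opC x c = opD (x + 1) c
opB x c = opC (x + 1) c

opA :: Int -> (Int, Counter)
opA x = opB x 0                   -- A initializes the counter and observes it
```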
Here's how it works. (See CTM for more information: section 4.7.2 in
the June 5 draft or section 4.8.2 in a later draft explains the idea with a similar
example.) The new module E has two interfaces, IE1 and IE2. IE1 is the original
interface (purely declarative) and is used as before. IE2 is a new interface with two
operations: InitCounter which initializes the internal E counter to zero and GetCounter
which returns the value of the internal E counter. (The E counter is of course
incremented when we call IE1.) Then all we have to do is pass an IE2 reference to
module A, which can then do what it likes with the counter. Modules B, C, and D see nothing at all.
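Transposed into Haskell for concreteness (hypothetical names), the stateful version looks roughly like this. Note that in Haskell the hidden counter forces E's operation into IO, which itself illustrates the point that the pure fragment cannot express it; in a language with native state, such as Oz, IE1 would be textually unchanged.

```haskell
import Data.IORef

-- Sketch of E with a hidden counter. opE is the IE1 operation;
-- initCounter/getCounter form IE2 and are handed only to module A.
-- B, C, and D call opE exactly as before and never see the counter.
data E = E { count :: IORef Int }

newE :: IO E
newE = E <$> newIORef 0

-- IE1: the original operation, instrumented internally.
opE :: E -> Int -> IO Int
opE e x = do
  modifyIORef' (count e) (+ 1)    -- the covert channel: hidden shared state
  return (x * 2)                  -- stand-in for E's real work

-- IE2: seen only by A.
initCounter :: E -> IO ()
initCounter e = writeIORef (count e) 0

getCounter :: E -> IO Int
getCounter e = readIORef (count e)
```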
You can only do this if you have true state. Here is one way of seeing why. True
state lets information pass "underground" between two interfaces, i.e., the
information passes without any apparent connection between them. This is because
the connection is the shared state, which is shared by the two interfaces yet hidden
from the outside. The shared state is a kind of covert information channel: it lets a
module pass information to other modules (or to itself in the future) without anybody
else seeing it.
There is a final point regarding static typing. The usual static type systems, such
as Haskell's, do not support this technique. The question is, how can type systems
be extended to support this technique? I am not an expert on type systems, but I
leave this question open for you. One hint may be given by the preceding
paragraph: maybe the covert information channel should be typed.
-------------------------
(*) You may counter that using monads can solve this problem in Haskell.
I.e., just write all your modules in a monadic way from the start.
Then there are three possibilities:
- Either you do it explicitly, in which case your whole program becomes more complicated, even those parts that have nothing to do with the instrumentation.
For example, what happens if there are also modules A', B', C', and D', where A' calls
B', etc., and D' calls E? They all have to carry around the instrumentation counter.
- Or you decide to hide the monadic operations from the programmer and let the
compiler add them.
In which case you have just implemented a stateful language on top of Haskell.
- Or you can use unsafePerformIO. In which case you are using true state.
And I have not yet mentioned the real issue, which is declarative concurrency. Two
concurrent activities execute independently. If you impose state threading using
monads, then they are no longer independent because you have to impose an order.
This may seem bearable in a sequential system (such as the pure subset of Haskell)
but it is simply not possible in a concurrent system (such as the declarative
concurrent subset of Oz, which is pure and yet concurrent).
Posted to functional by Peter Van Roy on 10/22/03; 3:04:50 AM
Ehud Lamm - Re: State and modularity
10/22/2003; 3:43:59 AM (reads: 1349, responses: 0)
Reminds me of this. In EOPL2 (see q. 3.47, for example) the authors claim that assignment (hence state) can be used to make a program more modular. I ask my students to consider this claim in light of what they were told about using global variables (and state), which destroy modularity.
The answer: local state is nice; global state isn't.
James Hague - Re: State and modularity
10/22/2003; 5:38:39 AM (reads: 1331, responses: 0)
I agree with Peter on this one, as I did when he started a (long!) thread on this subject a year or more ago in comp.lang.functional. This is one of those cases where the extreme scarcity of large programs of many kinds written in Haskell adds to the controversy. Is it simply that people are not using a pretty but academic language? Or is it because structuring many kinds of programs in Haskell is extraordinarily difficult?
Mario Blažević - Re: State and modularity
10/22/2003; 6:08:56 AM (reads: 1302, responses: 0)
This problem as stated could be easily solved by dynamically scoped implicit parameters (like those from http://lambda-the-ultimate.org/classic/messagx6433), with no need for true state whatsoever. The module E2 would invoke parameters ResetCounter and IncrementCounter, implicitly passed to it from the module A2. So there still would be no need for any change in the intermediate modules B, C, or D.
Regarding your larger issue, it looks to me that dynamic scoping of immutable values and declarative concurrency can coexist without problems. It's only with shared mutable variables that dynamic scoping becomes problematic.
Peter Van Roy - Re: State and modularity
10/22/2003; 6:39:26 AM (reads: 1285, responses: 0)
This problem as stated could be easily solved by dynamically scoped implicit parameters (like those from http://lambda-the-ultimate.org/classic/messagx6433), with no need for true state whatsoever. The module E2 would invoke parameters ResetCounter and IncrementCounter, implicitly passed to it from the module A2. So there still would be no need for any change in the intermediate modules B, C, or D.
Can you elaborate on this solution? How would invoking an operation in IE1
cause the IE2 counter to be incremented? Remember that the interface IE1
must not change at all. I don't believe you can do it, but I am willing to be
convinced otherwise.
Frank Atanassow - Re: State and modularity
10/22/2003; 8:07:46 AM (reads: 1263, responses: 0)
Peter, I agree with your analysis, and in fact I saw this example also in your book and found it quite thought-provoking. But, I want to add some observations, because I think a naive reader may interpret your example too easily as an argument for global state. (I guess Ehud sees the danger too.)
I am a Haskell programmer, and I agree there is no easy way to solve this problem; frankly, I think it's a failing of Haskell, though not of the purely functional paradigm if you interpret it broadly enough. There is a neat solution in a more general language which avoids the pitfalls of global state and retains the advantages of what you call declarativeness, namely your second alternative:
Or you decide to hide the monadic operations from the programmer and let the compiler add them. In which case you have just implemented a stateful language on top of Haskell.
I think the way you write this it sounds like a sort of compiler hack or blue-sky solution, but it can be given a formal and practical account thus: Haskell, being a purely functional language, can be assigned a trivial monadic semantics, namely over the identity monad. To solve the instrumentation problem, interpret the modules to be instrumented over a state monad. This would require adding another module in some extended language Haskell' which, when you want to run the program as normal, interprets programs in the identity monad, and when you want to run them instrumented, interprets them in the state monad. In other words, this extended language has a feature which lets you run arbitrary code in some particular monad.
This is, really, almost the same as your first solution, namely to make all the code monadic---that is, rewrite it all using do-syntax or >> and return---except that the compiler takes care of the work.
The advantage of this over a conventional procedural language with global state is that you retain the ability to reason easily about the program as it will normally be used, in the identity monad, yet can still instrument it modularly. Indeed, by picking other monads for the semantic interpretation you could conceivably create other interesting variations; for example, if the program is a search program, you could change its behavior from depth-first to breadth-first, or alter it to produce all results rather than only the first-encountered one. Maybe it is even the case that there is some theorem about monad transformers which lets you easily transfer theorems from the uninstrumented to the instrumented case.
I hope that was intelligible. I can make it more precise, if you like.
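One way to make this concrete in standard Haskell (hypothetical names): the "compiler adds the plumbing" step is simulated here by writing E against an abstract monad with a tick effect, which can then be interpreted in the identity monad or in a counting state monad.

```haskell
-- E's code is written once, polymorphic over a monad with a `tick`
-- effect; normal runs interpret it in Identity, instrumented runs
-- in a counting state monad. Hypothetical names throughout.

newtype Identity a = Identity { runIdentity :: a }
newtype Count a = Count { runCount :: Int -> (a, Int) }

instance Functor Identity where
  fmap f = Identity . f . runIdentity
instance Applicative Identity where
  pure = Identity
  f <*> a = Identity (runIdentity f (runIdentity a))
instance Monad Identity where
  m >>= f = f (runIdentity m)

instance Functor Count where
  fmap f (Count m) = Count $ \n -> let (a, n') = m n in (f a, n')
instance Applicative Count where
  pure a = Count $ \n -> (a, n)
  Count mf <*> Count ma = Count $ \n ->
    let (f, n') = mf n; (a, n'') = ma n' in (f a, n'')
instance Monad Count where
  Count m >>= f = Count $ \n -> let (a, n') = m n in runCount (f a) n'

-- The one effect the instrumentation needs.
class Monad m => Tick m where
  tick :: m ()
instance Tick Identity where tick = Identity ()              -- no-op
instance Tick Count    where tick = Count $ \n -> ((), n + 1)

-- E's operation, written once against the abstract monad.
opE :: Tick m => Int -> m Int
opE x = tick >> return (x * 2)
```

The same `opE` then runs unobserved via `runIdentity` or counted via `runCount`, which is the two-interpretation scheme Frank describes.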
Peter Van Roy - Re: State and modularity
10/22/2003; 8:30:04 AM (reads: 1252, responses: 0)
It seems that you can add either concurrency or state
to a functional language and keep it functional, but not both!
Peter Van Roy - Re: State and modularity
10/22/2003; 9:38:57 AM (reads: 1225, responses: 0)
How about designing a "compartmentalized" functional
programming language? It would be based on a simple sequential
functional core, with two extensions: one for concurrency
and one for state. Program "compartments" can either
use the core or one of the two extensions. Each
compartment is purely functional, so their interconnection
will be as well. In that way, large functional programs
can be written that use concurrency and state where needed,
without giving up being pure.
Just a random idea!
Patrick Logan - Re: State and modularity
10/22/2003; 10:10:19 AM (reads: 1207, responses: 0)
So in a large system such as this, how much is one going to rely on "functional reasoning" anyway?
How different is this kind of reasoning from what you would do with a well-designed imperative system? I suspect not much different.
Can you design equally bad systems, and equally good systems, with either approach? I suspect so.
I'm guessing. Can anyone not guess and still provide answers?
Vesa Karvonen - Re: State and modularity
10/22/2003; 11:00:10 AM (reads: 1186, responses: 0)
Can you design equally bad systems, and equally good systems, with either approach? I suspect so.
One thing is certain. Bad designers create bad designs in any language.
Luke Gorrie - Re: State and modularity
10/22/2003; 11:08:13 AM (reads: 1185, responses: 0)
So can good designers. :-)
Mario Blažević - Re: State and modularity
10/22/2003; 4:50:07 PM (reads: 1078, responses: 0)
Can you elaborate on this solution? How would invoking an operation in IE1 cause the IE2 counter to be incremented? Remember that the interface IE1 must not change at all. I don't believe you can do it, but I am willing to be convinced otherwise.
module E1
  ...
  operation :: T

module E2
  ...
  operation :: (?callback :: () -> ()) => T
It wouldn't be the IE2 counter, it would be a counter in A incremented whenever a callback function supplied from A to E2 was invoked. You're right that the interface IE1 would have to change, but with implicit parameters that wouldn't necessarily imply any change in B, C, or D, and that was the main requirement. Unfortunately, there is nothing like "implicit results" in the Haskell extension proposal I linked to, so the callback function would have to use unsafePerformIO to increment the counter in A.
I'd also like to point out that adding a plain counter to E is just bad design. What if the next step of instrumentation required tracking the time spent in E's operations instead of the number of their invocations? Would you recommend creating an interface IE3 to do that? It makes more sense to create a single interface IE, each of whose operations invokes callback functions StartEvent and EndEvent, provided as parameters to the module.
Of course, standard Haskell doesn't provide anything approaching parameterizable modules or implicit parameters. But that's a fault of Haskell, not of functional programming in general. Either of the two features would be a better solution to the stated problem than hidden state.
My point was that dynamic scoping is the obvious feature of choice when you need something that 'lets information pass "underground" between two interfaces'. There certainly are examples of problems where private state would be preferable to anything else, but this is not one of them.
Peter Van Roy - Re: State and modularity
10/23/2003; 12:17:58 AM (reads: 1007, responses: 0)
What if an operation in module B calls IE1 *twice*?
To be correct, your solution has to do threading
(pass counter output of one operation
as counter input to the next operation).
I don't see where the threading is happening.
You finesse the situation by calling unsafePerformIO,
but this is using true state so it doesn't count.
So I still don't believe you can do it, unless you
can provide me with a more complete solution.
Peter Van Roy - Re: State and modularity
10/23/2003; 12:20:59 AM (reads: 1006, responses: 0)
So in a large system such as this, how much is one going to rely on "functional reasoning" anyway?
How different is this kind of reasoning from what you would do with a well-designed imperative system? I suspect not much different.
This is the point. It is the overall simplicity of the system that
counts, not whether it is written in a functional or stateful style.
Using true state in the right places makes the system simpler
overall.
Christian Lindig - Re: State and modularity
10/23/2003; 5:26:42 AM (reads: 949, responses: 0)
It is the overall simplicity of the system that counts, not whether it is written in a functional or stateful style. Using true state in the right places makes the system simpler overall.
How about scalability? I can see that some state does no harm and indeed makes things simpler. But does this approach scale? What if you need more than just one counter? Of course, you don't find out about this requirement ahead of time but while the program evolves.
Peter Van Roy - Re: State and modularity
10/25/2003; 4:59:09 AM (reads: 759, responses: 1)
How about scalability? I can see that some state does no harm and indeed makes things simpler. But does this approach scale? What if you need more than just one counter? Of course, you don't find out about this requirement ahead of time but while the program evolves.
Well, you can always limit yourself to a single paradigm (e.g., functional or OOP)
and your results will be as scalable as that paradigm. Using more than one
paradigm can only do better.
Perhaps a more important question is, are multiple paradigms needed in a
single program? I think the answer to this question is yes, and that this
is obvious when you look at any program bigger than a toy example. For example,
you might want to implement an event manager (see section 7.8.5 in CTM). It takes any number of event
handlers, where each handler is a pure function from Event x State -> State. The
event manager handles both the concurrency of having more than one event
handler and the holding of the internal state of each handler. But each handler
is purely functional and sequential.
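A minimal sketch of that shape, in Haskell for continuity with the thread (hypothetical names; CTM's event manager is written in Oz and also handles the concurrency, which this sketch omits): the manager alone holds the mutable state, while each handler stays a pure function.

```haskell
import Data.IORef

-- Each handler is a pure function: Event -> State -> State.
-- The manager alone holds the mutable state between events.
type Handler e s = e -> s -> s

-- Returns an action that sends an event and an action that reads
-- the handler's current state.
newManager :: s -> Handler e s -> IO (e -> IO (), IO s)
newManager s0 h = do
  ref <- newIORef s0
  return (\e -> modifyIORef' ref (h e), readIORef ref)
```

For instance, sending the events 1, 2, 3 to a manager whose handler is `(+)` over an initial state 0 leaves the state at 6, yet the handler itself remains purely functional.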
Ethan Aubin - Re: State and modularity
10/25/2003; 7:48:02 AM (reads: 742, responses: 0)
Marc Hamann - Re: State and modularity
10/25/2003; 9:28:03 AM (reads: 756, responses: 0)
your results will be as scalable as that paradigm. Using more than one paradigm can only do better.
Hmmm. Scalability in my experience comes from making the components of a system as self-contained as possible and constraining the interactions between them to the essentials.
How will this become easier if sometimes I'm using one abstraction mechanism (where it seems "paradigms" differ) and sometimes another?
Or perhaps the whole concept of "paradigm" in this context is starting to creak under the weight of semantic overload, and in fact what you are advocating is a new coherent approach that draws on elements and techniques from previous approaches?
To me this would be quite different from the mishmash of jostling features that usually goes under the name "multi-paradigm", and which essentially equates with "no paradigm".
Peter Van Roy - Re: State and modularity
10/27/2003; 3:59:00 AM (reads: 700, responses: 1)
To me this would be quite different from the mishmash of jostling features that usually goes under the name "multi-paradigm", and which essentially equates with "no paradigm".
A multiparadigm language can be well designed or poorly designed.
A well designed one has a wide variety of programming
concepts that are factored, which means that you can pick the ones
you need and ignore the others.
Depending on which ones you pick, you have different
expressiveness and a different set of reasoning and programming
techniques. Then a "paradigm" is just a particular set of concepts that
happens to be useful for many problems.
See the Preface and Appendix D of CTM.
On the Expressive Power of Programming Languages
Thanks for the reference; I didn't know about this paper. If this example
and the virtues of state are so well known, then why is state still considered
"evil" or "awkward" or "impure" by functional programmers?
Marc Hamann - Re: State and modularity
10/27/2003; 3:35:08 PM (reads: 680, responses: 0)
Then a "paradigm" is just a particular set of concepts that happens to be useful for many problems.
By that definition, all general purpose programming languages are "multi-paradigm" ;-)
I'm starting to think that "multi-paradigm" either means experimenting with mixing "your peanut butter with my chocolate", i.e. two things that happened to be considered incompatible for purely historical reasons, or it means nothing at all.
Kinda leaning toward the latter... ;-)
Peter Van Roy - Re: State and modularity
10/28/2003; 12:31:01 AM (reads: 640, responses: 0)
By that definition, all general purpose programming languages are "multi-paradigm" ;-)
It's a question of degree.
The concepts have to be factored: you can pick the ones you need without
being forced to use the others. That's the key.
There also have to be enough concepts that it makes a difference.
It's technically correct to call C++ a multiparadigm language but
in practice the multiparadigm aspect of C++ is very slim indeed!
I'm starting to think that "multi-paradigm" either means experimenting with mixing "your peanut butter with my chocolate", i.e. two things that happened to be considered incompatible for purely historical reasons, or it means nothing at all.
Yes, if I understand your point, I think you're right: using concepts together
that were considered incompatible for historical reasons.
In my view, the only reasonable language to use is a (well designed)
multiparadigm language, since it lets you express what you
want without getting in your way.
Isaac Gouy - Re: State and modularity
10/28/2003; 9:35:08 AM (reads: 632, responses: 0)
multi-paradigm
In Thomas Kuhn's usage of paradigm, multi-paradigm doesn't make a lot of sense (don't recall if it was Kristen Nygaard or Ole-Johan Dahl or Ole Lehrmann Madsen who pointed out that 'programming paradigm' could more correctly be termed 'programming style').
mixing "your peanut butter with my chocolate"
Hmmmm Postmodern?
Kirsten Chevalier - Re: State and modularity
10/31/2003; 10:41:13 AM (reads: 578, responses: 0)
Now let's say I want to instrument E, to count how many times it is performing some operation, and I want to observe this from A. If I want to program this in a purely functional way (e.g., in the pure subset of Haskell), then I am obliged to change the interfaces of all the modules A, B, C, D, and E (*). If I have true state, then I can get by with changes in just A and E. In other words, instead of changing everything on the path, I can get by with changing things on the ends.
Perhaps I'm missing something, but I don't see what the problem is here. If the state is an abstract data type, then the modules cannot depend on how the state is represented. So if we change the type that represents the state from State to (State, Counter), nothing about B, C, or D changes. If we add new operations getCounter and setCounter (of type m Counter and (Counter -> m ()), where m is the underlying monad), then A and E can use these to manipulate the counter, and the counter is simply ignored by B, C and D.
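That suggestion can be sketched with a hand-rolled state monad in standard Haskell (hypothetical names): the middle modules are written against the abstract monad and never mention the state's representation, so widening the state with a counter touches only A and E.

```haskell
-- The state carried by the monad was St; it is now (St, Int).
-- B, C, and D are written against M's interface only, so they
-- compile unchanged; A and E use the two new counter operations.
type St = Int                       -- stand-in for the original state

newtype M a = M { runM :: (St, Int) -> (a, (St, Int)) }

instance Functor M where
  fmap f (M m) = M $ \s -> let (a, s') = m s in (f a, s')
instance Applicative M where
  pure a = M $ \s -> (a, s)
  M mf <*> M ma = M $ \s ->
    let (f, s') = mf s; (a, s'') = ma s' in (f a, s'')
instance Monad M where
  M m >>= f = M $ \s -> let (a, s') = m s in runM (f a) s'

-- New operations, used only by A and E:
getCounter :: M Int
getCounter = M $ \(st, c) -> (c, (st, c))

bumpCounter :: M ()
bumpCounter = M $ \(st, c) -> ((), (st, c + 1))

-- E bumps the counter; B, C, D just call through, unchanged.
opE :: Int -> M Int
opE x = bumpCounter >> return (x * 2)

opD, opC, opB :: Int -> M Int
opD = opE
opC = opD
opB = opC

-- A runs the chain and observes the counter.
runA :: Int -> (Int, Int)
runA x =
  let prog = opB x >>= \r -> getCounter >>= \n -> return (r, n)
      ((r, n), _) = runM prog (0, 0)
  in (r, n)
```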