State and modularity
started 10/22/2003; 2:59:35 AM - last post 10/31/2003; 10:41:13 AM
Peter Van Roy - State and modularity
10/22/2003; 2:59:35 AM (reads: 11676, responses: 22)
(This was originally discussed on the Yahoo pragprog list; I think it's important
enough to present it on LtU as well.)
In our experience, true state (destructively assignable entities) is essential
for reasons of program modularity (which for this discussion I take as
meaning: the ability to change one part of a program in a significant way
without changing the rest of the program). Threaded state, e.g., monads
as used by Haskell or DCGs as used by Prolog, cannot substitute for it.
Here is a simple example to show the difference between true state
and threaded state.
Consider a system consisting of modules A, B, C, D, and E,
all written in a purely functional way.
Module A calls B, which calls C, which calls D, which calls E.
Now let's say I want to instrument E, to count how many times
it is performing some operation, and I want to observe this from A.
If I want to program this in a purely functional way (e.g., in the
pure subset of Haskell), then I am obliged to change the interfaces
of all the modules A, B, C, D, and E (*). If I have true state, then I can
get by with changes in just A and E. In other words, instead of
changing everything on the path, I can get by with changing
things on the ends. This is a major difference that has all kinds
of software engineering ramifications.
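To make the contrast concrete, here is a toy sketch of the purely functional version, with hypothetical names and each module collapsed to a single function: the counter must be threaded through every signature on the path, so A, B, C, D, and E all change.

```haskell
-- Toy sketch of the threaded-state version: the counter is woven
-- through every call on the path A -> B -> C -> D -> E, so every
-- signature on the path changes, not just the two ends.
type Counter = Int

opE :: Int -> Counter -> (Int, Counter)
opE x c = (x * 2, c + 1)          -- E's "work", plus one count

opD, opC, opB :: Int -> Counter -> (Int, Counter)
opD x c = opE (x + 1) c
opC x c = opD (x + 1) c
opB x c = opC (x + 1) c

opA :: Int -> (Int, Counter)
opA x = opB x 0                   -- A initializes the counter and observes it
```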
Here's how it works. (See CTM for more information: section 4.7.2 in
the June 5 draft or section 4.8.2 in a later draft explains the idea with a similar
example.) The new module E has two interfaces, IE1 and IE2. IE1 is the original
interface (purely declarative) and is used as before. IE2 is a new interface with two
operations: InitCounter which initializes the internal E counter to zero and GetCounter
which returns the value of the internal E counter. (The E counter is of course
incremented when we call IE1.) Then all we have to do is pass an IE2 reference to
module A, which can then do what it likes with the counter. Modules B, C, and D see nothing at all.
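Transposed into Haskell for concreteness (hypothetical names), the stateful version looks roughly like this. Note that in Haskell the hidden counter forces E's operation into IO, which itself illustrates the point that the pure fragment cannot express it; in a language with native state, such as Oz, IE1 would be textually unchanged.

```haskell
import Data.IORef

-- Sketch of E with a hidden counter. opE is the IE1 operation;
-- initCounter/getCounter form IE2 and are handed only to module A.
-- B, C, and D call opE exactly as before and never see the counter.
data E = E { count :: IORef Int }

newE :: IO E
newE = E <$> newIORef 0

-- IE1: the original operation, instrumented internally.
opE :: E -> Int -> IO Int
opE e x = do
  modifyIORef' (count e) (+ 1)    -- the covert channel: hidden shared state
  return (x * 2)                  -- stand-in for E's real work

-- IE2: seen only by A.
initCounter :: E -> IO ()
initCounter e = writeIORef (count e) 0

getCounter :: E -> IO Int
getCounter e = readIORef (count e)
```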
You can only do this if you have true state. Here is one way of seeing why. True
state lets information pass "underground" between two interfaces, i.e., the
information passes without any apparent connection between them. This is because
the connection is the shared state, which is shared by the two interfaces yet hidden
from the outside. The shared state is a kind of covert information channel: it lets a
module pass information to other modules (or to itself in the future) without anybody
else seeing it.
There is a final point regarding static typing. The usual static type systems, such
as Haskell's, do not support this technique. The question is, how can type systems
be extended to support this technique? I am not an expert on type systems, but I
leave this question open for you. One hint may be given by the preceding
paragraph: maybe the covert information channel should be typed.
-------------------------
(*) You may counter that using monads can solve this problem in Haskell.
I.e., just write all your modules in a monadic way from the start.
Then there are three possibilities:
- Either you do it explicitly, in which case your whole program becomes more complicated, even those parts that have nothing to do with the instrumentation.
For example, what happens if there are also modules A', B', C', and D', where A' calls
B', etc., and D' calls E? They all have to carry around the instrumentation counter.
- Or you decide to hide the monadic operations from the programmer and let the
compiler add them.
In which case you have just implemented a stateful language on top of Haskell.
- Or you can use unsafePerformIO. In which case you are using true state.
And I have not yet mentioned the real issue, which is declarative concurrency. Two
concurrent activities execute independently. If you impose state threading using
monads, then they are no longer independent because you have to impose an order.
This may seem bearable in a sequential system (such as the pure subset of Haskell)
but it is simply not possible in a concurrent system (such as the declarative
concurrent subset of Oz, which is pure and yet concurrent).
Posted to functional by Peter Van Roy on 10/22/03; 3:04:50 AM
Ehud Lamm - Re: State and modularity
10/22/2003; 3:43:59 AM (reads: 1349, responses: 0)
Reminds me of this. In EOPL2 (see q. 3.47, for example) the authors claim that assignment (hence state) can be used to make a program more modular. I ask my students to consider this claim in light of what they were told about using global variables (and state), which destroy modularity.
The answer: local state is nice; global state isn't.
James Hague - Re: State and modularity
10/22/2003; 5:38:39 AM (reads: 1331, responses: 0)
I agree with Peter on this one, as I did when he started a (long!) thread on this subject a year or more ago in comp.lang.functional. This is one of those cases where the extreme scarcity of large programs of many kinds written in Haskell adds to the controversy. Is it simply that people are not using a pretty but academic language? Or is it because structuring many kinds of programs in Haskell is extraordinarily difficult?
Mario Blažević - Re: State and modularity
10/22/2003; 6:08:56 AM (reads: 1302, responses: 0)
This problem as stated could be easily solved by dynamically scoped implicit parameters (like those from http://lambda-the-ultimate.org/classic/messagx6433), with no need for true state whatsoever. The module E2 would invoke parameters ResetCounter and IncrementCounter, implicitly passed to it from the module A2. So there still would be no need for any change in the intermediate modules B, C, or D.
Regarding your larger issue, it looks to me that dynamic scoping of immutable values and declarative concurrency can coexist without problems. It's only with shared mutable variables that dynamic scoping becomes problematic.
Peter Van Roy - Re: State and modularity
10/22/2003; 6:39:26 AM (reads: 1285, responses: 0)
This problem as stated could be easily solved by dynamically scoped implicit parameters (like those from http://lambda-the-ultimate.org/classic/messagx6433), with no need for true state whatsoever. The module E2 would invoke parameters ResetCounter and IncrementCounter, implicitly passed to it from the module A2. So there still would be no need for any change in the intermediate modules B, C, or D.
Can you elaborate on this solution? How would invoking an operation in IE1
cause the IE2 counter to be incremented? Remember that the interface IE1
must not change at all. I don't believe you can do it, but I am willing to be
convinced otherwise.
Frank Atanassow - Re: State and modularity
10/22/2003; 8:07:46 AM (reads: 1263, responses: 0)
Peter, I agree with your analysis, and in fact I saw this example also in your book and found it quite thought-provoking. But, I want to add some observations, because I think a naive reader may interpret your example too easily as an argument for global state. (I guess Ehud sees the danger too.)
I am a Haskell programmer, and I agree there is no easy way to solve this problem; frankly, I think it's a failing of Haskell, though not of the purely functional paradigm if you interpret it broadly enough. There is a neat solution in a more general language which avoids the pitfalls of global state and retains the advantages of what you call declarativeness, namely your second alternative:
Or you decide to hide the monadic operations from the programmer and let the compiler add them. In which case you have just implemented a stateful language on top of Haskell.
I think the way you write this it sounds like a sort of compiler hack or blue-sky solution, but it can be given a formal and practical account thus: Haskell, being a purely functional language, can be assigned a trivial monadic semantics, namely over the identity monad. To solve the instrumentation problem, interpret the modules to be instrumented over a state monad. This would require adding another module in some extended language Haskell' which, when you want to run the program as normal, interprets programs in the identity monad, and when you want to run them instrumented, interprets them in the state monad. In other words, this extended language has a feature which lets you run arbitrary code in some particular monad.
This is, really, almost the same as your first solution, namely to make all the code monadic---that is, rewrite it all using do-syntax or >> and return---except that the compiler takes care of the work.
The advantage of this over a conventional procedural language with global state is that you retain the ability to reason easily about the program as it will normally be used, in the identity monad, yet can still instrument it modularly. Indeed, by picking other monads for the semantic interpretation you could conceivably create other interesting variations; for example, if the program is a search program, you could change its behavior from depth-first to breadth-first, or alter it to produce all results rather than only the first-encountered one. Maybe it is even the case that there is some theorem about monad transformers which lets you easily transfer theorems from the uninstrumented to the instrumented case.
I hope that was intelligible. I can make it more precise, if you like.
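One way to make this concrete in standard Haskell (hypothetical names): the "compiler adds the plumbing" step is simulated here by writing E against an abstract monad with a tick effect, which can then be interpreted in the identity monad or in a counting state monad.

```haskell
-- E's code is written once, polymorphic over a monad with a `tick`
-- effect; normal runs interpret it in Identity, instrumented runs
-- in a counting state monad. Hypothetical names throughout.

newtype Identity a = Identity { runIdentity :: a }
newtype Count a = Count { runCount :: Int -> (a, Int) }

instance Functor Identity where
  fmap f = Identity . f . runIdentity
instance Applicative Identity where
  pure = Identity
  f <*> a = Identity (runIdentity f (runIdentity a))
instance Monad Identity where
  m >>= f = f (runIdentity m)

instance Functor Count where
  fmap f (Count m) = Count $ \n -> let (a, n') = m n in (f a, n')
instance Applicative Count where
  pure a = Count $ \n -> (a, n)
  Count mf <*> Count ma = Count $ \n ->
    let (f, n') = mf n; (a, n'') = ma n' in (f a, n'')
instance Monad Count where
  Count m >>= f = Count $ \n -> let (a, n') = m n in runCount (f a) n'

-- The one effect the instrumentation needs.
class Monad m => Tick m where
  tick :: m ()
instance Tick Identity where tick = Identity ()              -- no-op
instance Tick Count    where tick = Count $ \n -> ((), n + 1)

-- E's operation, written once against the abstract monad.
opE :: Tick m => Int -> m Int
opE x = tick >> return (x * 2)
```

The same `opE` then runs unobserved via `runIdentity` or counted via `runCount`, which is the two-interpretation scheme Frank describes.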
Peter Van Roy - Re: State and modularity
10/22/2003; 8:30:04 AM (reads: 1252, responses: 0)
It seems that you can add either concurrency or state
to a functional language and keep it functional, but not both!
Peter Van Roy - Re: State and modularity
10/22/2003; 9:38:57 AM (reads: 1225, responses: 0)
How about designing a "compartmentalized" functional
programming language? It would be based on a simple sequential
functional core, with two extensions: one for concurrency
and one for state. Program "compartments" can either
use the core or one of the two extensions. Each
compartment is purely functional, so their interconnection
will be as well. In that way, large functional programs
can be written that use concurrency and state where needed,
without giving up being pure.
Just a random idea!
Patrick Logan - Re: State and modularity
10/22/2003; 10:10:19 AM (reads: 1207, responses: 0)
So in a large system such as this, how much is one going to rely on "functional reasoning" anyway?
How different is this kind of reasoning from what you would do with a well-designed imperative system? I suspect not much different.
Can you design equally bad systems, and equally good systems, with either approach? I suspect so.
I'm guessing. Can anyone not guess and still provide answers?
Vesa Karvonen - Re: State and modularity
10/22/2003; 11:00:10 AM (reads: 1186, responses: 0)
Can you design equally bad systems, and equally good systems, with either approach? I suspect so.
One thing is certain. Bad designers create bad designs in any language.
Luke Gorrie - Re: State and modularity
10/22/2003; 11:08:13 AM (reads: 1185, responses: 0)
So can good designers. :-)
Mario Blažević - Re: State and modularity
10/22/2003; 4:50:07 PM (reads: 1078, responses: 0)
Can you elaborate on this solution? How would invoking an operation in IE1 cause the IE2 counter to be incremented? Remember that the interface IE1 must not change at all. I don't believe you can do it, but I am willing to be convinced otherwise.
module E1
  ...
  operation :: T

module E2
  ...
  operation :: (?callback :: () -> ()) => T
It wouldn't be the IE2 counter, it would be a counter in A incremented whenever a callback function supplied from A to E2 was invoked. You're right that the interface IE1 would have to change, but with implicit parameters that wouldn't necessarily imply any change in B, C, or D, and that was the main requirement. Unfortunately, there is nothing like "implicit results" in the Haskell extension proposal I linked to, so the callback function would have to use unsafePerformIO to increment the counter in A.
I'd also like to point out that adding a plain counter to E is just bad design. What if the next step of instrumentation required tracking the time spent in E's operations instead of the number of their invocations? Would you recommend creating an interface IE3 to do that? It makes more sense to create a single interface IE, each of whose operations invokes callback functions StartEvent and EndEvent, provided as parameters to the module.
Of course, standard Haskell doesn't provide anything approaching parameterizable modules or implicit parameters. But that's a fault of Haskell, not of functional programming in general. Either of the two features would be a better solution to the stated problem than hidden state.
My point was that dynamic scoping is the obvious feature of choice when you need something that 'lets information pass "underground" between two interfaces'. There certainly are examples of problems where private state would be preferable to anything else, but this is not one of them.
Peter Van Roy - Re: State and modularity
10/23/2003; 12:17:58 AM (reads: 1007, responses: 0)
What if an operation in module B calls IE1 *twice*?
To be correct, your solution has to do threading
(pass counter output of one operation
as counter input to the next operation).
I don't see where the threading is happening.
You finesse the situation by calling unsafePerformIO,
but this is using true state so it doesn't count.
So I still don't believe you can do it, unless you
can provide me with a more complete solution.
Peter Van Roy - Re: State and modularity
10/23/2003; 12:20:59 AM (reads: 1006, responses: 0)
So in a large system such as this, how much is one going to rely on "functional reasoning" anyway?
How different is this kind of reasoning from what you would do with a well-designed imperative system? I suspect not much different.
This is the point. It is the overall simplicity of the system that
counts, not whether it is written in a functional or stateful style.
Using true state in the right places makes the system simpler
overall.
Christian Lindig - Re: State and modularity
10/23/2003; 5:26:42 AM (reads: 949, responses: 0)
It is the overall simplicity of the system that counts, not whether it is written in a functional or stateful style. Using true state in the right places makes the system simpler overall.
How about scalability? I can see that some state does no harm and indeed makes things simpler. But does this approach scale? What if you need more than just one counter? Of course, you don't find out about this requirement ahead of time but while the program evolves.
Peter Van Roy - Re: State and modularity
10/25/2003; 4:59:09 AM (reads: 759, responses: 1)
How about scalability? I can see that some state does no harm and indeed makes things simpler. But does this approach scale? What if you need more than just one counter? Of course, you don't find out about this requirement ahead of time but while the program evolves.
Well, you can always limit yourself to a single paradigm (e.g., functional or OOP)
and your results will be as scalable as that paradigm. Using more than one
paradigm can only do better.
Perhaps a more important question is, are multiple paradigms needed in a
single program? I think the answer to this question is yes, and that this
is obvious when you look at any program bigger than a toy example. For example,
you might want to implement an event manager (see section 7.8.5 in CTM). It takes any number of event
handlers, where each handler is a pure function from Event x State -> State. The
event manager handles both the concurrency of having more than one event
handler and the holding of the internal state of each handler. But each handler
is purely functional and sequential.
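A minimal sketch of that shape, in Haskell for continuity with the thread (hypothetical names; CTM's event manager is written in Oz and also handles the concurrency, which this sketch omits): the manager alone holds the mutable state, while each handler stays a pure function.

```haskell
import Data.IORef

-- Each handler is a pure function: Event -> State -> State.
-- The manager alone holds the mutable state between events.
type Handler e s = e -> s -> s

-- Returns an action that sends an event and an action that reads
-- the handler's current state.
newManager :: s -> Handler e s -> IO (e -> IO (), IO s)
newManager s0 h = do
  ref <- newIORef s0
  return (\e -> modifyIORef' ref (h e), readIORef ref)
```

For instance, sending the events 1, 2, 3 to a manager whose handler is `(+)` over an initial state 0 leaves the state at 6, yet the handler itself remains purely functional.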
Ethan Aubin - Re: State and modularity
10/25/2003; 7:48:02 AM (reads: 742, responses: 0)
Marc Hamann - Re: State and modularity
10/25/2003; 9:28:03 AM (reads: 756, responses: 0)
your results will be as scalable as that paradigm. Using more than one paradigm can only do better.
Hmmm. Scalability in my experience comes from making the components of a system as self-contained as possible and constraining the interactions between them to the essentials.
How will this become easier if sometimes I'm using one abstraction mechanism (where it seems "paradigms" differ) and sometimes another?
Or perhaps the whole concept of "paradigm" in this context is starting to creak under the weight of semantic overload, and in fact what you are advocating is a new coherent approach that draws on elements and techniques from previous approaches?
To me this would be quite different from the mishmash of jostling features that usually goes under the name "multi-paradigm", and which essentially equates with "no paradigm".
Peter Van Roy - Re: State and modularity
10/27/2003; 3:59:00 AM (reads: 700, responses: 1)
To me this would be quite different from the mishmash of jostling features that usually goes under the name "multi-paradigm", and which essentially equates with "no paradigm".
A multiparadigm language can be well designed or poorly designed.
A well designed one has a wide variety of programming
concepts that are factored, which means that you can pick the ones
you need and ignore the others.
Depending on which ones you pick, you have different
expressiveness and a different set of reasoning and programming
techniques. Then a "paradigm" is just a particular set of concepts that
happens to be useful for many problems.
See the Preface and Appendix D of CTM.
On the Expressive Power of Programming Languages
Thanks for the reference; I didn't know about this paper. If this example
and the virtues of state are so well known, then why is state still considered
"evil" or "awkward" or "impure" by functional programmers?
Marc Hamann - Re: State and modularity
10/27/2003; 3:35:08 PM (reads: 680, responses: 0)
Then a "paradigm" is just a particular set of concepts that happens to be useful for many problems.
By that definition, all general purpose programming languages are "multi-paradigm" ;-)
I'm starting to think that "multi-paradigm" either means experimenting with mixing "your peanut butter with my chocolate", i.e. two things that happened to be considered incompatible for purely historical reasons, or it means nothing at all.
Kinda leaning toward the latter... ;-)
Peter Van Roy - Re: State and modularity
10/28/2003; 12:31:01 AM (reads: 640, responses: 0)
By that definition, all general purpose programming languages are "multi-paradigm" ;-)
It's a question of degree.
The concepts have to be factored: you can pick the ones you need without
being forced to use the others. That's the key.
There also have to be enough concepts that it makes a difference.
It's technically correct to call C++ a multiparadigm language but
in practice the multiparadigm aspect of C++ is very slim indeed!
I'm starting to think that "multi-paradigm" either means experimenting with mixing "your peanut butter with my chocolate", i.e. two things that happened to be considered incompatible for purely historical reasons, or it means nothing at all.
Yes, if I understand your point, I think you're right: using concepts together
that were considered incompatible for historical reasons.
In my view, the only reasonable language to use is a (well designed)
multiparadigm language, since it lets you express what you
want without getting in your way.
Isaac Gouy - Re: State and modularity
10/28/2003; 9:35:08 AM (reads: 632, responses: 0)
multi-paradigm
In Thomas Kuhn's usage of paradigm, multi-paradigm doesn't make a lot of sense (don't recall if it was Kristen Nygaard or Ole-Johan Dahl or Ole Lehrmann Madsen who pointed out that 'programming paradigm' could more correctly be termed 'programming style').
mixing "your peanut butter with my chocolate"
Hmmmm Postmodern?
Kirsten Chevalier - Re: State and modularity
10/31/2003; 10:41:13 AM (reads: 578, responses: 0)
Now let's say I want to instrument E, to count how many times it is performing some operation, and I want to observe this from A. If I want to program this in a purely functional way (e.g., in the pure subset of Haskell), then I am obliged to change the interfaces of all the modules A, B, C, D, and E (*). If I have true state, then I can get by with changes in just A and E. In other words, instead of changing everything on the path, I can get by with changing things on the ends.
Perhaps I'm missing something, but I don't see what the problem is here. If the state is an abstract data type, then the modules cannot depend on how the state is represented. So if we change the type that represents the state from State to (State, Counter), nothing about B, C, or D changes. If we add new operations getCounter and setCounter (of type m Counter and (Counter -> m ()), where m is the underlying monad), then A and E can use these to manipulate the counter, and the counter is simply ignored by B, C and D.
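That suggestion can be sketched with a hand-rolled state monad in standard Haskell (hypothetical names): the middle modules are written against the abstract monad and never mention the state's representation, so widening the state with a counter touches only A and E.

```haskell
-- The state carried by the monad was St; it is now (St, Int).
-- B, C, and D are written against M's interface only, so they
-- compile unchanged; A and E use the two new counter operations.
type St = Int                       -- stand-in for the original state

newtype M a = M { runM :: (St, Int) -> (a, (St, Int)) }

instance Functor M where
  fmap f (M m) = M $ \s -> let (a, s') = m s in (f a, s')
instance Applicative M where
  pure a = M $ \s -> (a, s)
  M mf <*> M ma = M $ \s ->
    let (f, s') = mf s; (a, s'') = ma s' in (f a, s'')
instance Monad M where
  M m >>= f = M $ \s -> let (a, s') = m s in runM (f a) s'

-- New operations, used only by A and E:
getCounter :: M Int
getCounter = M $ \(st, c) -> (c, (st, c))

bumpCounter :: M ()
bumpCounter = M $ \(st, c) -> ((), (st, c + 1))

-- E bumps the counter; B, C, D just call through, unchanged.
opE :: Int -> M Int
opE x = bumpCounter >> return (x * 2)

opD, opC, opB :: Int -> M Int
opD = opE
opC = opD
opB = opC

-- A runs the chain and observes the counter.
runA :: Int -> (Int, Int)
runA x =
  let prog = opB x >>= \r -> getCounter >>= \n -> return (r, n)
      ((r, n), _) = runM prog (0, 0)
  in (r, n)
```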