Confused Deputies in Programming Languages

There was an insightful discussion recently on the capabilities mailing list about confused deputies, and how various language constructs can inadvertently introduce confused deputies. For instance, Alan Karp devised a confused deputy innocently introduced by Java's package scoping. I then reproduced this confused deputy using OCaml's modules.

In any system with encapsulation of some sort, there is an inherent disconnect between a caller's permissions to an object, and the callee's permission to that same object. Whenever the callee has greater authority over an object it's given than a caller, and this authority is granted implicitly (by the type system, for instance), a Confused Deputy can arise. This authority augmentation is called "rights amplification" in the capability security literature.

For example, the abstract OCaml type Confused.O.t is automatically amplified to an out_channel upon calling a function in module Confused.O. This usage is safe however. The only problems arise with binary functions, where the second parameter is also implicitly amplified. One solution proposed on the list was to forbid such implicit rights amplification for all parameters except the first, thus forcing all amplifications to be explicit (cap-talk is generally object-centric, thus the emphasis on the first parameter only).

I wasn't in favour of placing this burden on all binary functions, so I proposed instead to forbid "package-scoped" functions. By this, I mean that only publicly accessible functions are visible even for modules within the same "package" (or scope/nesting/etc.). The OCaml vulnerability is only introduced because module Confused.B can access the internal Confused.O.write function, which is not publicly accessible outside of the Confused "namespace".

If this lax scoping were forbidden, the compiler writer would have to either

  1. expose the Confused.O.write function publicly, in which case its obvious that writing to the billing file is possible outside of the compiler's module,
  2. split the functionality for writing the log and the billing file into separate modules (the best solution),
  3. parameterize Confused.B with a closure to write into a Confused.O instance, effectively making it a separately wielded capability.

In all cases, the authority being wielded is more explicit, and thus accidental vulnerabilities are minimized. My proposal may complicate abstractions provided as a set of co-operating modules however, so I have three questions:

  1. How common is it to have two co-operating modules use each other's private functions?
  2. Has there been any research on this or similar topics, or does this problem resemble any others that have been studied?
  3. Can anyone think of any other possible solutions to this problem, preferably with a minimal burden on the user?

On a final note, there was also some discussion about creating confused deputies just using shared mutable state, so there is yet more evidence of the safety of purely functional programming.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

My wife and I are co-operating modules...

...and how often we use each other's private functions is nobody's business.

Illogical

Obviously, "how often [you] use each other's private functions" is at least your own business, and therefore not "nobody's business." Furthermore, it is certainly the business of your respective doctor(s), and indeed public health officials, what sorts of things you might do in private. And it's in the public's interest that medical and public health professionals behave responsibly. Thus the subject is in everyone's interest, albeit indirectly.

And if I'm not mistaken, the above is an illustration of Noetherian induction. String rewriting systems: Catch the fever!

No comments? I'm surprised.

I expected some type-wizardry to be able to prevent this innocent security vulnerability. :-)

There is...

...I just haven't had time to dig into it, and the more general question of object-capability security enforced by the type system really needs more thought. I do believe that some combination of information-flow analysis and/or proof-carrying code can solve all of the problems solved by object-capability security—at least in the presence of a sufficiently powerful type system, probably including dependent types so as to capture the object-capability security community's key insight that it is the dynamic graph of references that is important rather than the static layout of, e.g. an access-control list.

Along with making concurrency safe and commonplace, I think this is the great challenge facing the design of the Next Mainstream Programming Language™. :-)

Someone here recently

Someone here recently pointed me to this blog post on using Arrows to model information flow in capability systems. It looks quite promising.

As for my post in particular, I don't think it requires any sophisticated typing, but I was hoping someone would comment on whether my restriction to module scoping was too onerous, or perhaps point me to some more well-founded technique that can tackle the Confused Deputy problem, even if only as a side-effect.

I do believe that some combination of information-flow analysis and/or proof-carrying code can solve all of the problems solved by object-capability security

Actually, I've come to believe that capability security is subsumed by referential transparency, which doesn't require anything too sophisticated (Hindley/Milner + Monads, at least). If you maintain referential transparency, you gain all the benefits of capabilities that I can see, and then some. A key open problem is properly structuring the FFI to follow capability rules, and not introduce any ambient authorities.

Capability security and referential transparency

Actually, I've come to believe that capability security is subsumed by referential transparency

I don't understand how this can be. Capability security is possible in a language with state. Referential transparency is not. Can you explain?

State is an essential property of programming languages, for reasons of modularity (as has been discussed before on LtU). Referential transparency is ok for parts of programs but it is much too restrictive to assume it for all programs.

No contradiction

naasking: Actually, I've come to believe that capability security is subsumed by referential transparency

PVR: I don't understand how this can be. Capability security is possible in a language with state. Referential transparency is not.

I don't see the contradiction here. He is claiming that RT is a sufficient condition, not a necessary one. And I tend to agree.

RT is sufficient for capability security?

Ok, I understand the point. It may be so. But I am not convinced that RT and CS are talking about the same things. RT does not respect abstraction boundaries: it says that equals can be replaced by equals. The first "equals" might be something outside an abstraction boundary and the second "equals" might be something inside that abstraction boundary! CS above all requires to respect abstraction boundaries.

I think the idea is

I think the idea is that the whole point about something being a capability is that it has some external effect. Hence any use of a capability is inherently impure, and RT code trivially CS.

Monads do not change this either. Computing a monadic action is pure by itself. Only executing it is potentially impure. But at that point you can choose the execution method, i.e. the interpretation of the monadic primitives - in other words, you choose which capabilities you provide. The RT code itself can never do that.

Capability = designation + authority

the whole point about something being a capability is that it has some external effect

Whether the effect is external or not is orthogonal to capabilities. A capability bundles designation and authority. For example, it can designate an object and give the authority to invoke a particular method of the object. This is also orthogonal to whether the language has state or not. See section 6.4 in CTM for more explanation.

any use of a capability is inherently impure

This is false. See section 6.4 of CTM, e.g., for an example of a purely declarative object. References to such things are capabilities.

Purity = ???? ; Profit!

I think you're using different values of "purity" - but one of them has some significant degenerate cases, as you highlight.

It's a common haskeller idiom to describe monadic code as being impure even if the monad in question has a pure implementation - the EDSL described by the monad being an impure language. But not all monads describe impure languages! The identity monad belongs in everyone's list of counterexamples to generalisations about monads, for example.

What monads (and EDSLs in general) do give you is 'delimited ambient authority'.

This is the debate I was

This is the debate I was hoping for! Lot's of interesting points. :-)

A capability bundles designation and authority. For example, it can designate an object and give the authority to invoke a particular method of the object.

Memory safety is all that is required to bundle designation and authority. The problem with most languages, is that they then add additional machinery above and beyond memory safety which violate both RT and CS principles. As for the connections between RT and CS, RT provides stronger properties than CS, while losing nothing except convenience.

This is false. See section 6.4 of CTM, e.g., for an example of a purely declarative object. References to such things are capabilities.

Yes, but that's not the interesting use-case.

If an object is stateless, then identity is meaningless (like Joe-E's Selfless objects). Purely functional objects are in this class.

If an object is not effect-free, a RT language requires some way to tame this impurity, whereas CS requires only that the effect is controlled via an explicit, unforgeable reference. Regardless of the taming method used, preserving RT provides stronger properties than simply controlling the impurity via memory safety. Only in this class of objects is identity meaningful.

For almost all practical purposes, state can be tamed through data abstraction. A stateful system can be designed so that a well-defined property, called an invariant, is always true when viewed from the outside.

It's true that abstraction can help one reason about effects, but it's also true that preserving RT enforces such invariants and makes that reasoning even easier. For instance, seeing "IO a" in a function or module signature is an immediate flag that invoking this object will have some potentially global side-effect. Refining the types of monadic effects ("IO a" => "DelDir a", or "Ref a", or "SendPacket a" etc.) makes this even more fine-grained and powerful.

Preserving CS is the bare minimum though, and SML is almost there if it weren't for all the ambient authorities exposed in the Standard Basis Library.

RT does not respect abstraction boundaries: it says that equals can be replaced by equals.

I don't see the problem. From the object-centric view, a client can only interact with an object by sending messages, and if B and C both respond to the same messages, it doesn't matter if you've substituted C for B. This perfectly describes such "equals", and the situation is no different with RT.

CS is ultimately about controlling effects, and RT is all about that as well. RT just does it "better". :-)

RT is not an unqualified good

RT provides stronger properties than CS, while losing nothing except convenience

RT is not an unqualified good. It is a trade-off: one gains some properties (equals can be replaced by equals) in exchange for losing other properties (in particular, modularity). Whether or not the trade-off is a good one depends on the case at hand. There are many cases when it is not a good idea to have RT.

RT languages cannot tame true mutable state. They always lose the modularity of true mutable state: the ability to change part of a program without changing the rest. Therefore programming languages should also allow one to program without RT, if the programmer so decides.

seeing "IO a" in a function or module signature is an immediate flag that invoking this object will have some potentially global side-effect

There are two problems with this statement. (1) Global state is never needed; local state is sufficient. (2) Frequently one needs to change the state of an entity without that being visible in the signature (modularity again!). Monads don't allow that. This problem is always finessed in an RT language by stopping the running program, changing the definitions, and recompiling the program. But the system in which the RT language is implemented always has true mutable state, and for good reason.

RT is not an unqualified

RT languages cannot tame true mutable state. They always lose the modularity of true mutable state: the ability to change part of a program without changing the rest.

I don't see how this can be true. Monads are only one way to tame effects. I'm currently leaning towards uniqueness typing myself, though I'm investigating other substructural type systems as well. Monads require refactoring to add effects (a non-local change), but substructural types do not require non-local changes, unless you're propagating effects outside of the function's scope.

(1) Global state is never needed; local state is sufficient

It depends what you mean by "global". From the program's perspective, global means "anything beyond my control", ie. the outside world. Effects made in the outside world are clearly non-local. For instance, the file system, sent network packets, etc.

Frequently one needs to change the state of an entity without that being visible in the signature (modularity again!).

I don't find that convincing. The caller is making himself vulnerable by invoking the callee, so he would want to know whether the callee will actually terminate, for instance (termination is an effect). So there are many good use-cases for exposing callee properties in its type. I consider E's auditors a form of typing for instance.

So I don't think this is a case of modularity at all. If one ignores callee properties, one sacrifices safety. It's a viable trade-off, but that's not the same trade-off you're implying.

But the system in which the RT language is implemented always has true mutable state, and for good reason.

Or we're all running in a monad. :-)

[Edit: on RT being an unqualified good, I think anything that helps one to reason about a system is an unqualified good. Thus, RT is an unqualified good in my view. The means by which the user is exposed to RT is definitely not an *unqualified* good however.]

State and modularity (again)

I don't see how this can be true.

The discussion State and modularity has an example that can't be done without true mutable state.

Or we're all running in a monad. :-)

Check out "The fallacy of the preprocessor" on page 318 of CTM.

true == implicit?

That example shows you can only transparently add externally visible state to a module in a language where access to state is implicit. That's only an argument for implicit state if it's a good thing that the change is completely transparent.

That sort of change might be equally convenient but not transparent with something like effect polymorphism.

In the context of security making sure that sort of change is not transparent might help audit for information security.

SML is almost there

SML is almost there if it weren't for all the ambient authorities exposed in the Standard Basis Library

Note that SML allows one to rebind any top-level identifier, making the original binding inaccessible to the rest of the program. So, in a traditional SML system, one can relatively easily replace the entire basis library. IOW, one can easily write an SML' dialect (providing a different basis library) on top an existing SML implementation.

For example, I used this technique in a smlbot (for IRC) to make it impossible to perform IO. Basically, a prefix code, that rebinds all IO capable modules to a dummy (empty) module, is evaluated before the user program.

Traditional, interactive SML systems (e.g. Poly/ML) provide a means to save the "world" and restart from a given "world", thus making it possible to create a dialect without having to evaluate the new basis library dynamically. Some dialects also provide library definition languages beyond the basic module language. SML/NJ's CM and MLton/MLKit's MLBasis system are examples of this. I know of two examples of using the MLBasis system to create a [replacement for/extensions of] the basis library. They can be found from MLton's library repository.

I knew you could rebind

I knew you could rebind definitions, but I thought these were limited to "shadow" bindings, ie. a lexically shadowed binding, so it only applies to code under your control. If you're loading third-party SML code, doesn't it have access to the original top-level bindings?

What would be the new semantics of a rebound open_file if you wanted to forbid I/O?

If you're loading

If you're loading third-party SML code, doesn't it have access to the original top-level bindings?

Here are a couple of answers, pick your poison:

1. But you shouldn't allow one to load SML code. You should only allow one to load SML' code. I really don't see how you could load (precompiled) third-party code without the risk of it having access to some ambient authority and without imposing a priori restrictions on the third party code. IOW, I think that there is a minor contradiction in your requirements. If the SML' language has an FFI, it must be designed so that a library cannot silently obtain access to foreign functions.

2. SML, as such, doesn't provide a means to load precompiled (SML or otherwise) code (SML, the Definition, does not provide for separate compilation). Interactive SML systems often provide a (non-standard) use function or directive to evaluate SML code from a specified file. The evaluation is (usually) done with respect to the current top-level environment. (It is like you would have typed the code at the REPL.) So, in such a system, if you first rebind top-level bindings and then use some code, it cannot access the original top-level bindings.

What would be the new semantics of a rebound open_file if you wanted to forbid I/O?

None (you can select which values to leave visible in a module) or whatever you want (rebind it to whatever value you want).

For example, the SML basis provides the function print at the top-level. It has the spec:

val print : string -> unit

If you don't want the rest of program to have access to it, you can evaluate:

val print : unit = ()

In a more sophisticated system, like the MLBasis system, you can completely hide top-level bindings. IOW, the new basis doesn't have to have a top-level binding of print.

Restricting information access

Can't you use CS to restrict access to information in a RT setting?

Capability security is

Capability security is possible in a language with state. Referential transparency is not.

That's not true.. all Haskell programs are RT, yet Haskell has true mutable state. The interface to mutable state in Haskell is certainly not perfect - I don't think state should be bundled in the IO monad - but I think a better RT interface could be designed.

On whether referential transparency subsumes capability security: it's clearly not strictly true, because Haskell has RT but not CS. The IO API would need to be redesigned as naasking says.

Monads can not implement true mutable state

Monadic state does not have the modularity properties of true mutable state. This has already been discussed on LtU. See also section 4.8 of CTM.

BTW, there is nothing inherently wrong with true mutable state. It is an essential concept, and poses no particular problem when handled with care.

Overstatement?

[Mutable state] poses no particular problem when handled with care.

Isn't that quite a bit of an overstatement? Just look at the semantics, meta-theory, and reasoning techniques for languages with and without mutable state. I dare say you won't find any other PL feature that can complicate matters as profoundly. Taming it still is a very hard research problem. For example, a denotational semantics for state with first-class functions is so complicated that it's hardly useful.

Denotational semantics?

a denotational semantics for state with first-class functions is so complicated that it's hardly useful

Idem for a denotational semantics for concurrency. Doesn't that rather say something about denotational semantics?

No overstatement

For almost all practical purposes, state can be tamed through data abstraction. A stateful system can be designed so that a well-defined property, called an invariant, is always true when viewed from the outside. This makes reasoning about the system independent of reasoning about its environment. This gives us back one of the properties that makes declarative programming so attractive. (Paraphrase of part of Section 6.2.1 of CTM.)

In fact, the real problem is not state, but global state. Local state, which is encapsulated inside a data abstraction, is all we really need and it has no particular problem.

What poses problem when handled with care?

The problem with state is how can we be careful with it when it's invisible. In a language that makes effects explicit (e.g. monads, uniqueness typing, effect tracking) we know exactly what will happen if we can a function, but when state is invisible, there're no guarantees and reasoning about code becomes harder. Ditto for other effects like continuations, exceptions, space consumption. For example, the fact that (when effects are invisible) f x . g y is different than (\z -> (\w -> f x w) (g y z)) throws away many valuable properties that equational reasoning assumes. Such properties of languages with explicit state are valuable for writing correct code. If we start mixing implicit effects, suddently it's much harder to know what will happen when both occur. For example in any stateful language with effects there's no way to guarantee that when exceptions are thrown certain stateful operations are undone, it's very common in Java to have objects with broken invariants after exceptions happen, in Haskell we can use STM and be safe knowing that there are semantic guarantees related to correctness.

Calling state an essential concept just begs to know what essential is. Foundationally state isn't required, otherwise the lambda calculus wouldn't be turing complete. Your work points that it's a desirable fundamental feature for language composition, but it doesn't make it essential. The fact that it was already discussed on LtU doesn't mean that a correct understanding of this was reached, far from it there are many active research areas that consider state explicitly.

Warning: may contain heresy

Calling state an essential concept just begs to know what essential is. Foundationally state isn't required, otherwise the lambda calculus wouldn't be turing complete.

I'm going to have to side with Peter, here and I want to support that view with a two-pronged argument: a practical one and a theoretical one.

From the practical point of view, there are problems that one wants to solve with software that are simply, irriducibly, inherently stateful. For example, any garden-variety online commercial application is inherently stateful in the sense that it needs to track non-monotonic changes in real-time information.

Now you can use the technique of RT to reduce the complexity of this problem down to the bare nub of of its essential statefulness, and this is a good idea from a complexity-reduction point of view, but you can't 100% eliminate statefulness from a problem like this that has inherent statefulness.

To move to the theoretical front, I'm going to make the more controversial point: I think that the LC does contain inherent statefulness, and that, by definition, Turing-completeness is a kind of statefulness.

Potential non-termination is inherently stateful in the sense that, on any given reduction step n of some arbitrary expression E, I can't in fundamental principle tell whether I'm going to ultimately terminate in some finite number of steps, or not terminate.

Now this might not sound like state, but hopefully everyone can see how it destroys RT: if the expression terminates, it will equal to, and hence substitutable for, some family of expressions representing its value (i.e. normal form). If it doesn't, then it can only be substituted for another non-terminating expression (if you are allowing that identification).

The catch is that you may never know which answer is true, so your whole logical system is stateful in the sense that it may be one in which E has a value already at some step n, one where E has a value at some step greater than n, or it may be system where E never gets a value.

I think this is the Achilles heel of denotational semantics or other mathematical models of computation, such as CT, because in math you always assume you can zip to the end of infinity and see what "really happens", whereas, in computation, you have to accept as a foundational idea of the field that you can't do that for arbitrary cases.

I'm going to have to side

I'm going to have to side with Peter

I'm assuming you mean, you're siding with him on the necessity of state, since that seems to be the thrust of your post, and not on the "RT does not subsume capabilities" issue.

I think that the LC does contain inherent statefulness, and that, by definition, Turing-completeness is a kind of statefulness.

I think I agree. Non-termination is an effect, just like mutation, so there is a strong connection there.

Go Team Peter!

I'm assuming you mean, you're siding with him on the necessity of state, since that seems to be the thrust of your post, and not on the "RT does not subsume capabilities" issue.

You're right as far as that response goes (I was responding to Daniel's post), but I also agree with him that RT and capabilities are orthogonal concepts, even if RT may moot the latter in some circumstances. ;-)

[small edit]

I also agree with him that

I also agree with him that RT and capabilities are orthogonal concepts, even if RT may moot the latter in some circumstances. ;-)

What I'm really after, is what those remaining circumstances actually are. What properties can capabilities provide that RT does not?

E.g.

What properties can capabilities provide that RT does not?

Imaging the difference in an unreliable distributed environment. RT can't be guaranteed, since a remote system might be down, or not get a message, or have been restarted, flushing the cache, etc. In this scenario, capabilities are necessarily stateful.

Imaging the difference in an

Imaging the difference in an unreliable distributed environment. RT can't be guaranteed, since a remote system might be down, or not get a message, or have been restarted, flushing the cache, etc. In this scenario, capabilities are necessarily stateful.

Statefulness does not preclude referential transparency! Let's assume an SML with uniqueness typing:

type 'a channel{U}   (* {U} qualifies the channel as unique, so we can only read from it once *)
type 'a message = Data of 'a | Restarted | Failure | ...

(* read returns the pending message and a new unique channel for all subsequent operations *)
val read: 'a channel{U} -> 'a message * 'a channel{U}

let ch : string channel = ... in
  let (reply, ch2) = read ch in
    match reply with
        Data x ->    (* compute with x *)
      | Restarted -> (* remote service restarted *)
      | Failure ->   (* some sort of channel failure *)

The above is referentially transparent, and also capability secure, and it's properties are stronger than CS alone.

Transparent?

The above is referentially transparent, and also capability secure, and it's properties are stronger than CS alone.

I don't know; to my eye this set up is RT in the same way that sticking your fingers in your ears and saying "Nya, nya, I can't hear you" prevents you from hearing unpleasant truths. ;-)

Moreover, though this is compatible with capabilities, you don't actually show the capability implementation, and I don't see how you would get the capability behaviour for free in this instance just because this is "RT". This supports the notion that the RT part and the capability part are orthogonal (note the claim was they were orthogonal not mutually inconsistent.)

Moreover, though this is

Moreover, though this is compatible with capabilities, you don't actually show the capability implementation

I think this a very odd statement. What is a "capability implementation"? Capability security properties are satisfied by simple memory safety, so what is there to show? You can't have RT without memory safety, so by extension, RT includes capability security (for pure language objects).

Capabilities have a few other requirements, namely taming ambient authorities (impure objects external to the language/system), but these requirements involve controlling effects, which is already satisfied by preserving RT. Thus, RT subsumes capabilities. :-)

Que?

What is a "capability implementation"? Capability security properties are satisfied by simple memory safety, so what is there to show?

We seem to have lost each other somewhere.

Remember I was assuming a distributed system (which to me implies participants without shared memory). It is unclear to me how mere memory safety will solve the problem in such an environment.

Or maybe we are two people separated by the same language. ;-)

Remember I was assuming a

Remember I was assuming a distributed system (which to me implies participants without shared memory). It is unclear to me how mere memory safety will solve the problem in such an environment.

Assuming an untrusted open network, which is a maximally hostile environment, you simulate memory safety by sparseness (unguessable identifiers). That's the best you can do in such situations, RT or not.

Nobody expects the RT inquisition!

From the practical point of view, there are problems that one wants to solve with software that are simply, irriducibly, inherently stateful. For example, any garden-variety online commercial application is inherently stateful in the sense that it needs to track non-monotonic changes in real-time information. Now you can use the technique of RT to reduce the complexity of this problem down to the bare nub of of its essential statefulness, and this is a good idea from a complexity-reduction point of view, but you can't 100% eliminate statefulness from a problem like this that has inherent statefulness.

I was never arguing against state, the issue I see is with invisible state. There's a fundamental difference between causing and not causing an effect, for example mutating a piece of state, this can't be denied, so the question is how can the programmer know that a piece of code is safe to share or isn't? Without language support there's no way to be sure, we end up doing unnecessary work (e.g. defensive copies of mutable data structures) to avoid possible problems, overspecifying order of evaluation, introducing bugs. Invisible state means that every function call may cause state effects, visible state just show which ones are state dependent. There's a huge difference between f :: a -> b, f :: a -> IO b, f :: IO (a -> b) and f :: IO (a -> IO b), invisible effects blur those and treat statically as the same whereas semantically they're different.

To move to the theoretical front, I'm going to make the more controversial point: I think that the LC does contain inherent statefulness, and that, by definition, Turing-completeness is a kind of statefulness.

Potential non-termination is inherently stateful in the sense that, on any given reduction step n of some arbitrary expression E, I can't in fundamental principle tell whether I'm going to ultimately terminate in some finite number of steps, or not terminate.

You're mixing state with non-termination, which are diffent kind of effects (just to nitpick), so let's use a correct terminology and just talk about effects. As you can see I talked about effects in general, so my argument is about invisible effects. I as a programmer care a great deal if a piece of code terminates or not, but the syntatic interpretation of most languages ignore this property, so I have to test a lot (hoping to cover all the corner cases) and prove it correct (assuming that I don't commit mistakes). If the languaged tracked the distinction my job would be much easier. I'm in the camp of "Turing-complete is (for most apps) overrated, just give me data and codata!".

In theory code with effects is different from code without effects and in practice I do care about it. My daily work is in Java (doing "enterprisey" and web apps), so I have to spend time figuring out if I have to make a defensive copy of a collection before passing it as a parameter to something (because we never know if the code will change it or not), having to do code reviews to see if the code I wrote is exception safe, checking if my recursions and loop are terminating or productive. Effects are necessary, but not everywhere, and if they are invisible there's no way to tell where the effects are.

Copying collections

My daily work is in Java (doing "enterprisey" and web apps), so I have to spend time figuring out if I have to make a defensive copy of a collection before passing it as a parameter to something

I can sympathize with that. However, I would argue that the problem there is with Java's pervasive default mutability and lack of comprehensive support for immutable data. A sane imperative language like SML offers comprehensive support for immutable data. In SML, unless you really want to allow others to mutate your data, you just use immutable data. I don't think I've ever had to just copy mutable data in SML for safety like I've done in Java and C++, for example.

having to do code reviews to see if the code I wrote is exception safe, checking if my recursions and loop are terminating or productive

Note that concerns like these also apply to code written in a pure language. For example, it is easy to write code in the IO monad in Haskell that isn't exception safe (e.g. leaks handles or leaves a mutable data structure in an invalid state). Purity also doesn't automatically make recursion terminating or productive.

Actually, I just recently read a comment (not for the first time) from a paper (tech report Testing Properties of Generic Functions by Jansson and Jeuring) that they had to catch exceptions from "pure" (non-monadic) code in (Generic) Haskell. But, of course, Haskell does have effects beyond non-termination in "pure" code, so that doesn't contradict your point.

(minor edits)

We are the knights who say CTM!

I was never arguing against state, the issue I see is with invisible state.

Then perhaps we are all in agreement. An essential benefit of viewing things through the CTM hierarchy is the insight that you can have a system that, taken as a whole, is stateful (or if you prefer effectful), but that, by restricting your use of various kernel language abilities, you can help make the inherently effectful parts more visible by allowing effectful abilities only where they are needed.

You're mixing state with non-termination, which are diffent kind of effects (just to nitpick), so let's use a correct terminology and just talk about effects.

Since one good nitpick deserves another ;-), though I will grant that what you propose is a consistent terminology, and fairly standard in certain contexts, at the foundational level I was discussing, I don't think this distinction makes a difference.

This may be the source of our seeming "disagreement": we are probably focusing on different levels of abstraction.

*witty subject goes here*

Then perhaps we are all in agreement. An essential benefit of viewing things through the CTM hierarchy is the insight that you can have a system that, taken as a whole, is stateful (or if you prefer effectful), but that, by restricting your use of various kernel language abilities, you can help make the inherently effectful parts more visible by allowing effectful abilities only where they are needed.

I don't think we all are in agreement. From previous discussions with Peter van Roy (unfortunately my goggle-fu revealed nothing, shame on me) his usual exemplo is a module that provides a function (say fibonacci) and then updates the module to make it memoized without changing the callers. So it's implicit state AFAICS, which I classify as evil and unclean. No issue with CTM hierarchy thingy, it was it that made me focus on calculi instead of full languages.


[On edit] I found a piece relevant to this discussion.

Depends on the problem

then updates the module to make it memoized without changing the callers. So it's implicit state AFAICS, which I classify as evil and unclean.

I don't recall that example, but I can't really know how to evaluate the relative evilness of the solution, without knowing the problem it solves.

Let's say it is intended to be a unique key generator for some distributed app: would that be an okay use of the "evil"? (I'm not sure how else you would solve this problem without shared state...)

If you are arguing for language mechanisms that help to enforce and make visible such statefulness, I really don't think you'll get an argument about that, especially if you can come up with a mechanism to do it that everyone likes. ;-)

I really do think there is more agreement than is apparent: one side says "wisdom is using as little state as possible", the other says "state is okay, but use it wisely (i.e. as little as possible, but no less)".

To my mind the difference is merely one of emphasis and perspective. How little state you can get away with depends on the problem you are trying to solve...

doubleplusungood

Let's say it is intended to be a unique key generator for some distributed app: would that be an okay use of the "evil"? (I'm not sure how else you would solve this problem without shared state...)

Which role would such generator play? IME hiding the network is wrong, so every remote operation would be already tagged, so code like getRemoteId :: Address -> IO Id would already assume some sort of IO happening, it wouldn't matter if the remote app used state or not (as things like network failure, delays, etc., are necessary to handle). The id generation code can be as simple as a makeId :: Id -> Id function and have the ids explicitly threaded.

I really do think there is more agreement than is apparent: one side says "wisdom is using as little state as possible", the other says "state is okay, but use it wisely (i.e. as little as possible, but no less)".

I updated my post (the one you replied to) with a link to his argument. PVR's posts imply that the discussion ended and state is known to be necessary, while I disagree with that assertion and believe that the discussion is still open.

Love Big Brother

Which role would such generator play? IME hiding the network is wrong, so every remote operation would be already tagged

I'm getting the feeling that you (and maybe naasking too) are thinking about all these issues from a language-specific viewpoint. You seem to be assuming language features that, though I won't disagree are nice to have, are not the only way to do things, and certainly don't describe the boundaries of the solution space for the given problems in any fundamental way.

Solving some problems does require state, and making it visible by labelling or segregating it, however good that may be, doesn't get rid of it.

Is all of this just covert language advocacy, supporting your language(s) of choice and its way of doing things and critiquing those of PvR's language(s) of choice?

Perhaps a valid discussion to have, but not really the perspective I'm coming from in this discussion.

I updated my post (the one you replied to) with a link to his argument.

Ah yes, I remember that thread well. Interesting evidence that I haven't always seen eye to eye with PvR. ;-)

The particular problem he gives is, I think a good one: you have to instrument parts of a process without making too many non-local changes to the call chain. Think for example if you don't actually control the source for some of the steps in the chain (e.g. you are using a 3rd party library.)

Would your solution be "I can't do that; it's evil"?

I'm getting the feeling that

I'm getting the feeling that you (and maybe naasking too) are thinking about all these issues from a language-specific viewpoint.

That's probably a fair conclusion. Languages are a powerful descriptive and reasoning tool, and I think that everything can be described by some language. Ultimately then, we are interested in languages that provide the most general, most powerful, and most concise tools to describe and reason about language terms.

Solving some problems does require state, and making it visible by labelling or segregating it, however good that may be, doesn't get rid of it.

I don't think anyone's trying to get rid of state. Daniel and others are merely trying to make its presence explicit so that its effects can be reasoned about. Capabilities are useful for reasoning about the effects of programs as well, and this debate kicked off when I essentially suggested that preserving referential transparency was more effective than capabilities for such reasoning.

2 + 2 = _|_

I'm getting the feeling that you (and maybe naasking too) are thinking about all these issues from a language-specific viewpoint. You seem to be assuming language features that, though I won't disagree are nice to have, are not the only way to do things, and certainly don't describe the boundaries of the solution space for the given problems in any fundamental way.

No really, I didn't understood what was the issue about the generator. I mean if it's supposed to be used inside a particular node of a network to generate ids that must be unique within itself, if it generates ids that should be unique to a set of notes or whatever. Each case has different problems, the ones which should be internal are different than those that should be external. What I said about tagging the network interface is that f :: a -> b is a local operation and f :: a -> Net b is a remote operation, without being able to disguise a remote as a local one to achieve "network transparency". For a concrete example in the app I'm working on I generate unique ids by using SHA256 of the content I'm identifying, as I don't want to rely on guid and I need to be able to consistently obtain the same id on different nodes of my app. If I was issuing ids to a network of (possibly untrustful) nodes I would just cryptograph some tuple of information to be able to generate unguessable keys. But I don't see how those relate to the question

Solving some problems does require state, and making it visible by labelling or segregating it, however good that may be, doesn't get rid of it.

Is all of this just covert language advocacy, supporting your language(s) of choice and its way of doing things and critiquing those of PvR's language(s) of choice?

I'm not arguing that state is useful or necessary, again I'm against invisible effects. My issue is with just hand-waving this discussion as already solved. I see this discussion as an important one to have, today I'm less sure of my position that I was years ago when I argued with PVR about it on the pragprog list, IMO it's [effect tracking] an interesting PLT problem.

The particular problem he gives is, I think a good one: you have to instrument parts of a process without making too many non-local changes to the call chain. Think for example if you don't actually control the source for some of the steps in the chain (e.g. you are using a 3rd party library.)

Would your solution be "I can't do that; it's evil"?

Let's turn this problem around, if I'm the third party library author I have some contract to fulfill to my clients and a contract I expect my suppliers to hold. If my code calls a library in a way that isn't exception safe (e.g. I open a file, call the library, get some result then enter a try/finally block to use both and close the file) I depend on the contract that the library won't throw any exceptions, on what control effects the library raises. Perhaps my type system helps me keep track of what I open and close (e.g. uniqueness types), so the type system also assumes no effects. If I'm working in a machine with tight resources my language may keep track of space/time, so when I call some library I know that it'll use n words of memory and m clock ticks (or whatever), otherwise I exhaust the resources or miss deadlines: I depend on which time/space effects caused. Most of the time I don't depend on those, so I can just say that whatever effects the libraries cause I cause too. I'm looking for a solution to this where I can say how much I require from the code I use and how much I ensure so my clients know how well I play with my dependencies and how well I work. This needs to be parametric, so I can say map :: (a -> b ^ c) -> [a] -> [b] ^ [c]> (using a^c to mean that the evaluating a causes the effect c) and don't need to worry about what kind of effects it raises. In part this is similar to capabilities, because I will either create the code I need to use (therefore ensuring it causes only the effects it does) or recieve it from my client as parameters/capabilities, so I'm subjected to whatever effects are bundled with its operations. All of it using a parametric framework with some sort of subtyping-like mechanism will ensure that if I don't care about some effects I'm parametric about those (which are free theorems). With some more pseudo-code


module a(b',c)
  doA = (b' (instrumented c)).doB 
  instrumented c = ...
module b(c)
  doB n = c.doC n + c.doC n
module c
  doC n = n * 2

I believe that such kind of explicit passing of references and tracking of effects would help to improve programming. This is different from current approaches, because we let effects happen and don't sequence them using monads or unique types, but some sort of semantics extension (e.g. do (io, left to right) {...code goes here...}). But I still only have some pieces of this puzzle, I don't even know if this approach is good or necessary, but I don't think the discussion ended and invisible effects are a feature.

Today I think that the problem PVR posed is great and needs to be solved, but in a way that don't break any code. With invisible effects the code may break silently (the effects caused may harm any of the intermediate libraries or may mix in unfortunated ways with other effects, like unwind-protected and call/cc do) so I think this is not a solution but a temporary patch. Now I don't have a solution to this (but I'm working on it ;) but I recognize that there are some hard problems that need to be tackled and calling this a non-issue is harmful. OTOH as we wait for the correct solution we just put the damn counter in there and test it until we are confident we did't break anything.

Just to be clear, we need to be careful when we say "we can solve this problem with X" to be sure that we don't introduce new, harder, problems with it. I'm don't believe that any of the available solutions today are good. We need to keep the discussion going until we do have a solution. Perhaps the solution is stop trying to use classical logic with software (and have a correct/wrong binary value) and use a bayesian approach to correctness (so far the evidence adds up to 99.95% that this works), so we could ditch static type systems and use some sort of reflexive evidence gathering approach. But the discussion must go on.

Re: The discussion must go on.

while i am only slowly/barely/notreally keeping up with the thread, i'd like to encourage anybody who is trying to question assumptions.

In an ideal world...

For those of us who don't have the legacy issue to worry about, the solution is likely akin to universal dependency injection: code that's polymorphic in the environment it's run in, so that you can run it in a stateful environment with the back channels in place when you need to. In Haskell this tends to mean monads, and taking computations instead of functions as parameters where appropriate. It's already an idiom for error-handling, though partly due to a historical accident that I and many others have had cause to complain about in the past!

How can effects be both invisible and visible?

With invisible effects the code may break silently

Yet sometimes the effects have to be invisible (which is my point). How do we solve this? I think one solution is for a component (or function) to have several interfaces. One interface (the functionality) doesn't change when state is added (like the memoization example). The second interface *does* change and through it we can see how the function uses state internally.

I propose we agree to disagree

Yet sometimes the effects have to be invisible (which is my point).

I disagree with this premise. If the language doesn't support static checking of effects then we have no other choice, but the program has a different semantics depending on which effects it causes. As I wrote in another post, all these are semantically different: f :: a -> b, f :: a -> IO b, f :: IO (a -> b), f :: IO (a -> IO b), providing a common interface to them blur the semantic differences. If I call a function expecting it to have one of those semantics and it happens to be one of the others my program will have a different meaning than the one I expected. What we need is a way to signal that a code is parametric on the effects of it's suppliers so when it's module is instantiated we can bind to suppliers with the desired effects and those will just propagate to its client. In pseudo-code:


module B(C :: {Effect E; doC :: Int -[E]-> Int})
  doB :: Int -[E * E]-> Int
  doB n = C.doC n + C.doC n

Of course the current state of art doesn't have a clean solution to do this, but this don't make the problem disappear or make the current hacks become correct solutions: the invisible effects change the meaning of the program and most often than not (IME) it causes errors when the hidden effects interact with other effects. I agree that simple things like instrumentation are mostly harmless, but even things like memoization can get tricky (when we need some policy to remove unused entries and have things like weak references and friends) and if the language allows hidden effects the cat is out of the box and effects can be everywhere. That is the worst thing with such solutions, the signal (i.e. where the effects are) gets lost in a sea of noise (i.e. every function is a potential source of effects).

Sounds sensible

to me, on the face of it, fwiw. i certainly agree that plenty of bugs come from people not doing a good enough job making sure effects are done right. any thoughts on how compact / automatic the process of writing out the types/kinds/annotations for the effects could be? if it becomes a ton of new ASCII that the programmer has to write it could easily be seen as too onerous. what is the closest one can come with available systems today?

[edit: is dynamic binding in Haskell at all useful? er, i guess i was looking at that because i thought it was related to state and modularity's problem statement.]

Invisible effects only under semantic equivalence

Yet sometimes the effects have to be invisible (which is my point).

I think Daniel's point is that effects are only invisible if they do not affect the semantics of the program. Even changing the time and space behaviour of a module, as with memoization, can be an undesirable semantic change (think: languages that provide control over resources). Without being able to reason about these effects, we are reduced to statistical trial and error to reason about program behaviour.

exactly

exactly

Invisibility + reasoning: it's possible to have both

Without being able to reason about these effects, we are reduced to statistical trial and error to reason about program behaviour.

I never said that one shouldn't be able to reason about the effects. In my last reply, I propose a solution that allows one to have both the invisibility *and* to reason about the effects.

Wat baten kaars en bril als den uil niet zien en wil?

Reasoning globally

Module A depends on B which depends on C. If we add effects to C that are invisible to B how do we know that B will still function correctly? I agree that we can reason about C as a unit, but it invalidates our reasoning about B (e.g. it relied C's purity). If B is parametric on C's effects then we have no problems, but if requires that certain effects don't happen, then we have a problem. If B has effects of it's own, also hidden from A, neither A nor C are able to know if C's effects are safe when mixed with B anymore.

Why effects but not other properties?

There are plenty of ways to screw up B by changing C without changing the types of C, unless the behavior of C is completely described by its types (we already have such a description: C's code). Can you explain why you think that effects should be mentioned in the types, while other things that affect the behavior of the program should not?

.

The original problem was changing C to add some sort of effects that would be invisible from B but exposed to A via another interface. Also if the types are unchanged and no effects are added B can't be broken by any possible changes in C. That's different from saying that it's behavior won't change.

What's the difference?

What's the difference? I understand that the change of behavior could be intended. Isn't it possible that something in C is changed that violates B's assumptions? For example an assumption about the return value of a function in C (e.g. that a sort function is a stable sort), or about the time complexity. From an end-user perspective the software would be broken. Does broken have another meaning in the context of programming languages?

.

For example an assumption about the return value of a function in C (e.g. that a sort function is a stable sort), or about the time complexity.

Within a pure system the types are the allowed assumptions. If the type system is powerful enough to express a stable sort, you can only rely on it if the function provided specified it. If you rely on assumptions that the type system doesn't provide you're on your own, like depending on invisible side-effects. For example if B depends on a stable sort then either C must give some proof of it (e.g. the type system specifies the stable sort) or it must accept whatever C gives it and do the checks itself. It's important to separate what we think our program do and what's its semantics. Using the stable sort example if C provides sort :: [a] -> [a] we may think that the resulting list is sorted, but the semantic also includes the cases where sort is the identity function, the infinite possible constant functions, etc., so we can't be sure of anything other than it being a list of a.

Looking at time and space from a semantical perspective we can see those as effects too, so in a more rigorous setting we should treat them as we treat other effects (if we want to reason about them). As usually we don't care about space or time (but we can and should IMO) we don't do such analysis or just do them outside of the language.

Does broken have another meaning in the context of programming languages?

If we reason about B in isolation we can define its semantics IRT the type system and the language semantics. The ability to reason about modules in isolation is essential to software development, otherwise we can't handle the resulting complexity of whole world reasoning. If we reason about B using the interface provided by C, but this reasoning doesn't hold when we use a different (but conforming) implementation of C, then our reasoning is wrong (i.e. broken). Actually the reasoning was broken because the semantics accepts too much (e.g. in Java any piece of code can throw unchecked exceptions so it's the semantic of every program that it can either do what it should do or end with an exception, in Haskell every computation can diverge, so divergence is a valid result of all computations). There are some techniques to reason about code even when the semantics accept undesired results (e.g. this paper) but we really can do as much as the language semantics let us do.

The essence of this argument is that we can't reason about what we don't know and if the language allows unknowable things then we can't reason about anything at all, other than running the program under a controlled interpreter and seeing the results.

How much detail in the types?

So the things you can know about C is C's type. For example if C has a sort :: [a] -> [a] we cannot know that the list sort returns is sorted. You can do this with a more powerful type system, e.g. sort :: [a] -> SortedList a. But this type says nothing about its time complexity. Maybe sort just tries all permutations of [a]. sort should have type sort :: List a n -> SortedList a n, O(n log n). This still does not tell us everything about sort. To know everything we have to encode the implementation of sort in its type. This doesn't seem right to me: the reason we have modules at all is to hide complexity (so it's good that some things are unknowable). What's the right way to think about this? How much detail should the type provide? Multiple levels of detail, i.e. multiple types for different kinds of reasoning; one that says that the list is sorted, another that the sort is stable, another that it will run in O(n log n) time?

interesting question

sure, rope is useful, but one can tie oneself up in it if not careful :-)

is there wisdom on LtU about this? i've often found myself day dreaming of a system which would help you keep track of complexities for performance reasons.

Struggling with multiple levels of detail

I've been dealing with this problem myself recently. I've come to the opinion that keeping a "database" of properties in the type is the right approach.

An example for a "split" function that I wrote eventually ended up with the following types. They types were added by accretion. I never set out to put so much in the type but I kept needing different aspects or combinations of aspects for further lemmas/theorems or further strongly specified functions. I found it to be easier to throw everything in, and throw out what I didn't need later.

(* (CharIn c s) means that c is in s *) 
Fixpoint CharIn (c:ascii) (s:string) {struct s} : Prop :=
  match s with
    | EmptyString => False
    | String c' s' => c = c' \/ CharIn c s'
  end.

(* Every char in s satisfies f, or s is empty *)
Definition all_sat (f : ascii -> bool) (s : string) :=
  (s = "" \/ ~s = "" /\ forall c, CharIn c s -> f c = true).

(* (terminates f s) means that s is either empty, or the first char 
   does not satisfy f.  It is a string that as the second part,
   terminates a split. *)
Definition terminates (f : ascii -> bool) (s : string) : Prop := 
  match s with 
    | EmptyString => True 
    | String c s => match bool_dec (f c) true with 
                      | left Htrue => False
                      | right Hfalse => True
                    end
  end.

(* the splitl function *)
Definition splitl : forall (f : ascii -> bool) (s:string), 
  { p : string * string | (fst p) ++ (snd p) = s /\ all_sat f (fst p) /\ terminates f (snd p)}. 

I'm not including the actual implementation of splitl, but you can see from the type that it is specified to give a pair which when appended are the original string, every single element of the first string satisfies f and the second string is either empty, or the first element doesn't satisfy the function.

So we ended up with a conjunction of properties that are all satisfied. You are free to throw out the parts you don't care of, or throw the entire property proof away easily and just use the element by destructuring the sigma type { x: T | P x }.

That's a very nice

That's a very nice specification. Do you prove that the implementation of splitl behaves as specified or is this specification accepted as true?

It feels like the distinction between the code and the type becomes smaller as the specification becomes more precise. The specification is more and more like the code itself. You could say that the code is the most precise specification possible, but it is often not very useful as a specification. For example, it probably isn't entirely obvious that (fst p) ++ (snd p) = s holds if you read the code of splitl. Therefore it seems to me that it's sometimes desirable to have a weak specification like (fst p) ++ (snd p) = s instead of a full specification, like the code of the function. If you're trying to determine how fast a function runs it's more useful to have its time specification, like O(n), than to have its implementation, even though you could deduce the O(n) time from its implementation.

-------

Which language are you using? I am able to understand it except this part: bool_dec (f c) true, in the terminates definition.

Why isn't the match statement like this:

match f c with
| True => False
| False => True
end

-or-

not (f c)

-------

Most times you are interested in relationships between functions rather than types of values. For example if you have an implementation of pairs you are interested in:

first (pair a b) = a
second (pair a b) = b

Rather than the types of the values produced by pair, first and second. And for dictionaries you want to know that:

d = some_dict
lookup (insert (x, y) d) x = y

Can your language express this?

Back to Coq et. al.

It feels like the distinction between the code and the type becomes smaller as the specification becomes more precise. The specification is more and more like the code itself.

so once again we see that we should all be writing everything in Coq?

Strong Specification

The language of the example is Coq. The specification has a proof of correctness associated, which I didn't include because it required several lemmata which didn't seem relevant. The complete code is at:
Substring.v.

Decidable boolean equality is implemented as a library in Coq and it includes a proof of the truth of the statement, not just which computational branch to take, which makes it a stronger specification than a normal equality/disequality function would have. From the Coq library:

Lemma bool_dec : forall b1 b2 : bool, {b1 = b2} + {b1 <> b2}.

So we can in all cases decide if two booleans are the same or different. This is provably terminating and gives us power in later proofs, since we can use use the proofs that were necessary to produce this type as witnesses to more complex types later. In the case you are looking at though, I think you are correct in that it doesn't make any difference and a normal boolean would be sufficient as the proof (Htrue/Hfalse) is never used.

Monads can not implement true mutable state

Monadic state does not have the modularity properties of true mutable state. This has already been discussed on LtU. See also section 4.8 of CTM.

I don't have a copy of CTM - can you (or anyone) provide a link to the previous discussion being referenced here? Is this just a reference to the fact that you can't e.g. provide a memoizing implementation for a value of pure type?

Mutable state is always going to be an essential part of any computation machine, but it seems to me that the language we use to encode and discuss stateful behavior should be referentially transparent so that we can describe and reason about the state change (or concurrency) from within the language.

link: State and modularity

I don't have a copy of CTM - can you (or anyone) provide a link to the previous discussion being referenced here?

I believe the first such discussion (on LtU) was State and modularity. I think the subject has also come up in comments in later threads, though.

Thanks

I agree with Frank Atanassow's comment in that thread.

State is far from

State is far from inextricably bundled in the IO monad - there's just one particular form of reference that's found there, and you can almost always use STRefs instead there, as in this ghci session:

Prelude Control.Monad.ST Data.STRef> r <- stToIO $ newSTRef "foo"
Prelude Control.Monad.ST Data.STRef> stToIO $ readSTRef r
"foo"

Capabilities and RT

The above discussion is interesting, but there is at least one obvious security concern which isn't addressed well by RT (and may be hindered by it)--preventing an attacker from reading data which he is not authorized to see.

Such an attacker typically will not want to mutate anything--ideally (for them) they will leave zero traces of their presence.

Capabilities, or other means of enforcing modularity, can mitigate this sort of attacks by a) denying access to information to those who lack the appropriate credentials, and b) ensuring that (authorized) access to sensitive information is logged for subsequent audit or forensic analysis. Note that in the latter case, you want an action which is normally side-effect-free to have a side effect (a record of the access).

Capabilities are unforgeable

Capabilities are unforgeable descriptors providing access to an object. Lambda names in the lambda-calculus are also unforgeable references to objects. Thus, lambda names are capabilities, and protection from reading is achieved by withholding the lambda designating the secret, just as protection from reading is achieved by withholding the capability designating the secret.

Hopefully, my clarification below explains this sufficiently.

What is capability security and how can it be compared to RT?

I think there is some confusion in this debate, including in my own comments. Upon posting the link to this debate on cap-talk, David-Sarah Hopwood wrote back to agree with PVR, with the argument that RT does not imply encapsulation, where capability security requires it. I found this statement somewhat confusing, and it resulted in the 'aha!' moment where I realized how we are all talking past each other.

I clarified my position in a reply, and I just posted a follow-up to explain why the distinction I am making is applicable. I'll reproduce this last reply here:

Sandro Magi wrote:
> When speaking of referential transparency, I am of course speaking of it
> in terms of programming languages, and the lambda calculus at a minimum.
> A lambda is encapsulated.
>
> In Section 9.1, footnote 1 of MarkM's thesis, he defines the
> "object-capability security model" as:
>
> "Our object-capability model is essentially the untyped call-by-value
> lambda calculus with applicative-order local side effects [...]"
>
> Preserving referential transparency eliminates the dependency on
> evaluation order, while preserving all of the language's other
> properties that I can see. There might be a small hit to expressiveness
> due to more tightly controlled side-effects, but I think this is largely
> compensated for by the increased reasoning power over effects.

Upon further thought, this requires further elaboration. Progress is made when building on existing knowledge, and this process presumably results in fine-grained distinctions which are important to deep understanding of a subject.

When considering capabilities from a language perspective, there are not as many novel properties as there are from the OS perspective. As I quoted above [1], the object-capability model is effectively the lambda-calculus. The novelties of object-capabilities in this setting aren't protected, unforgeable descriptors, since the lambda calculus already has lambda names. The "reference graph = access graph" equivalency is already present.

The novelty of capabilities for programming languages is thus in *controlling access to effects* via the reference graph. From this perspective, techniques to preserve referential transparency in the lambda calculus and capability security restrictions to languages built on the lambda calculus to control "world" effects are directly comparable.

MarkM's thesis [1] distinguishes "primitives", which are internal to the language, and "devices", which are external. Most languages provide ambient access to devices resulting in all the current problems in isolating subsystems.

Both capabilities and RT techniques impose restrictions on the use of effects which affect the "world", ie. via devices. In Haskell, this distinction is not very fine-grained at the moment, as most effects simply live in the "IO" monad, but there is no reason why these effects couldn't be made more fine-grained, ie. IO => DelDir, SendPacket, etc.

When combined with types, RT is a powerful reasoning tool. RT further tends to increase modularity and reuse, which helps with code correctness, so there are compelling software engineering benefits as well.

How to derive capability security from RT

On cap-talk, Kevin Reid pointed out that confining effects via a monad like IO merely enforces a static access control policy. I think the reason for this is that Haskell is somewhat sloppy here. I'll produce an expanded version of that reply here, since it concerns language design.

If I understand it correctly, the type of Haskell's 'main':

main :: IO ()

This signature implies an naive ontology where 'main' is the sole program in all existence. Of course, this is clearly not true, as 'main' is launched by another program, which in turn was itself launched by a program, and so on and so forth. This signature for a program would not be viable if we were all running Haskell-based operating systems for instance (please correct me if I'm wrong here). Thus, 'main' executes within a pre-existing World, inducing a particular side-effect, and thus the signature should be something like:

main :: World -> IO ()

Where World is a fully abstract signature provided by the runtime. The runtime can bind FFI functions, like openFile, via World:

openFile :: World -> FilePath -> IO Handle

Since World is simply a signature, a function could conceivably build another World' conforming to the World signature, and run a new program within World'.

And thus we have achieved capability security. Relying on a single World type is less than ideal, which is why I still think designing a proper FFI is still something of an open problem, but it's sufficient for capability security.

What about other worlds?

Since World is simply a signature, a function could conceivably build another World' conforming to the World signature, and run a new program within World'.

To pick up from earlier in the thread: what if there is not one authoritative world you working with? This is the normal situation in a distributed application (which in real life most applications are).

In such a case, RT is not reliably possible across the "collection of worlds", and so capabilities are necessarily something different from RT.

In such a case, RT is not

In such a case, RT is not reliably possible across the "collection of worlds", and so capabilities are necessarily something different from RT.

There is always a single underlying World, otherwise no communication channel could exist between them. This channel between World' and World'' is built over the common World.

[Edit: I think it's clear that the underlying physical world is the canonical World, and all World'' inevitably make upcalls to their parent World' which eventually bottoms out at the physical World. Hopefully, that clarifies things.]

When is a world not a World?

There is always a single underlying World, otherwise no communication channel could exist between them. This channel between World' and World'' is built over the common World.

I think we are getting closer to the source of our different perspectives.

Normally, when we think about RT, we are implicitly considering some World which is sufficiently specified and accessible so that we can make the RT guarantee.

To continue with your terminology, normally how we do this is by defining a parent World that makes a contract with our RT world to handle all the messinesses of the physical world, and only present them locally in our world as though they too are RT. (This is one way to think about how monads provide impurity in a pure world.)

Now in a truly peer-to-peer distributed case, the parent World is almost purely conceptual, since there is no canonical representation of the shared World all in one place that can make an RT contract with its children.

And so inevitably, either your RT world has to stick its fingers in its ears and pretend it can't hear anything that isn't RT, which strongly limits how much of the parent World it can see, or it needs to be realistic and skeptical about its parent world and use a more rigorous implementation of capabilities to communicate with it than the simple RT discipline it uses internally.

I don't think I follow

I don't think I follow your argument. From my understanding, a monad can be conceptualized as encapsulating the hidden state of the World, such that all computations done within the monad simply sample the current state of the World (outside), and transition the local (inner) World'(1) to state World'(2).

Obviously there are limits to our ability to see the World, and these limits are reflected in the APIs that accept a World.

The canonical representation of World in a p2p distributed case, sufficient for our purposes at least, is the network connecting the two peers. Are you suggesting that a monadic interface, or perhaps some type and effect system, cannot adequately model this situation?

The world is a vampire

and transition the local (inner) World'(1) to state World'(2).

This again assumes that your monad system has a complete and authoritative representational version of the current World somewhere.

The canonical representation of World in a p2p distributed case, sufficient for our purposes at least, is the network connecting the two peers

And how in general does that network ensure the RT contract?

Are you suggesting that a monadic interface, or perhaps some type and effect system, cannot adequately model this situation?

For some realistic values of "this situation" and "adequately model", yes.

Which comes back to the original point: that there are situations where capabilities cannot be subsumed by RT, making them an independent concept with their own assumptions, uses and guarantees.

This again assumes that your

This again assumes that your monad system has a complete and authoritative representational version of the current World somewhere.

I don't understand. The FFI gives you as much access to the outside world as is accessible via programmable sensors and devices. Are we talking about two completely different things?

And how in general does that network ensure the RT contract?

Let's try a different tack: Haskell right now supports concurrent programming over channels via monads. How does this differ from the p2p scenario you proposed?

Nitpick

It's not main that should take a World, it's a hypothetical runIO function. The reason for this being that main really is just a designated entry point and is otherwise not special in any way whatsoever. Your scheme would (hopefully!) prevent recursive calls to main.

Confused

The signature 'World -> IO()' doesn't make much sense to me. Maybe World is just poorly named, but the whole point of returning an IO monad is to hide the world. Also, I don't think the problem you're trying to solve here actually exists. What does that signature get you that adding 'getWorld :: IO World' wouldn't?

More to the point, why have a World type at all? The runtime system that runs main is going to run it in a context - there is no requirement that the "callbacks" bound in a monad will always deterministically get the same results. That's the whole point :).

If you want to have fine grained dynamic capabilities, put them in values. File handles are a simple example of that. RT makes sure those don't leak. As always you can do things to get better static static guarantees in the types, but ultimately you're going to want a dynamic security model. ie. deleteFile :: String -> IO () cannot be checked statically - you need to know what file that string refers to at runtime.

Regarding networking, if we're talking about running code safely across several trusted machines (with trusted runtimes), then exposing values like file handles in a machine independent way seems like perfectly reasonable magic to provide under the covers. You can even provide a monadic way to ask if you're dealing with a remote object or a local one - I don't see a problem there. On the other hand, RT has very little to do with the case of networking with untrusted parties.

What does that signature get

What does that signature get you that adding 'getWorld :: IO World' wouldn't?

World is no longer an ambient authority. Your getWorld exposes an ambient authority. Maybe I didn't explain the new signature properly: 'main' with the new signature is now the entry point of the program.

File handles are a simple example of that. RT makes sure those don't leak.

And yet, why can't RT be used to ensure that authority doesn't leak? I'm placing authority to access the outside World in a value just like file handles.

As always you can do things to get better static static guarantees in the types, but ultimately you're going to want a dynamic security model. ie. deleteFile :: String -> IO () cannot be checked statically - you need to know what file that string refers to at runtime.

The point of World is that deleteFile is not invokable ambiently. See the cap-talk post I linked to above for the advantages that World brings here. In summary: World can effectively encapsulate a chroot. You hint that there may be some other way that maybe I'm not getting:

The runtime system that runs main is going to run it in a context - there is no requirement that the "callbacks" bound in a monad will always deterministically get the same results. That's the whole point :).

And where is this runtime system defined? The point of reifying World as a value, is that the runtime is now accessible from within the language itself, and can be rebound, rather than as some mysterious, ambient primitive. I tried to provide an ontological argument for why the ambient runtime doesn't make sense in an RT language, but perhaps it's not as convincing to others as it is to me. :-)

I have more to say on mutually trusted runtimes, etc. but I'll stick to the core contentious issue for now. :-)

Abandoning the World

I think I understand what you're trying to do, and I wouldn't do it with a World type. I think it distracts from the issue. The problem today is that Haskell's entry point for every program has the same type (IO ()), which grants way too much authority to most programs. Rather than trying to fix this by adding a World type (and I confess along with others, that I don't really see how that solves it), you'd like to be able to specify a more restrictive type for the program's entry point, right? Then you put the onus the OS/runtime to understand how to satisfy the obligations encoded in that type (which might include things like rights to open a network socket on a particular port, read files inside some root directory, and so on).

As you said, you can think of the problem as being in part due to the fact that the OS doesn't know anything about the program, so all entry points must have the same very general type. How would you fix this if the OS itself were a larger Haskell program? Just as above (and in the recent Swierstra pearl), you'd use a more granular IO type, not a World type.

I agree I'd like to see a

I agree I'd like to see a type reflected on main that reflects that a program uses networking and openGL, say, but there's still a level of dynamic security that you can't get this way. Like if you want to take an arbitrary IO () off the network and grant it file access, but restricted to a particular directory.

Then you put the onus the

Then you put the onus the OS/runtime to understand how to satisfy the obligations encoded in that type [...] Just as above (and in the recent Swierstra pearl), you'd use a more granular IO type, not a World type.

I agree that World is not an ideal situation. I said as much in my original post. I just don't have a better structure yet. This structure will have to satisfy secure FFIs. But I disagree with one point: using World IS putting the onus on the runtime. I'll explain, and hopefully clarify how World achieves more granular authority.

More granular monadic types only gets you so far. Let's assume we have separate types for files and network namespaces (I'm not familiar with the real Haskell types, so I'm just guessing):

type FileSpace  -- assume both are kept abstract
type IpSpace

openFile :: FileSpace -> FilePath -> Maybe IO Handle
openIpAddr :: IpSpace -> IpAddr -> Maybe IO Socket

main :: FileSpace -> IpSpace -> IO ()

This is similar to the World value, except more granular. Do you agree?

Now you want to add some new namespace for a feature that wasn't previously designed for, ie. linked in via the FFI. How would you support that if the entry point must have a well-defined structure? Here's how I sort of envision this working using World:

type World
type FileSpace

root :: World -> FileSpace   --FFI call
openFile :: FileSpace -> FilePath -> Maybe IO Handle
openDir :: FileSpace -> FilePath -> Maybe Directory
chroot :: Directory -> FileSpace

main :: World -> IO ()

In capability terms, World is a closely-held authority, ie. it is held only by a trusted core of code and only derivative rights are doled out as needed (like FileSystem). I know I used World in the original openFile definition, but this structure reflects more closely what I have in mind.

A similar structure would exist for IP addresses, and other functions that interact with the world. Anything accepting a World parameter eventually bottoms out to an FFI call. The runtime maintains a map of (function, int) pairs. The int indexes into a blob the runtime uses to store the FFI data for the module, and this blob is passed to the module's function when it calls into it.

The runtime plays the intermediary for all World calls. I would love a more granular design, I just haven't been able to come up with one just yet. I just started considering this issue though, so I'm sure there are better designs. Something involving Dynamics might be able to bind the appropriate types to an entry point with no well-defined structure.

World is no longer an

World is no longer an ambient authority. Your getWorld exposes an ambient authority. Maybe I didn't explain the new signature properly: 'main' with the new signature is now the entry point of the program.

Ok I think understand what you're proposing now - you're going to provide some function like restrictPermissions :: ... -> World -> World that lets you build new restricted worlds. I think World is a bad name for this -- something like IOPermissions would be better I think.

But IMO there's also a better way to do this. Leave everything the way it is now, and provide withPermissions :: ... -> IO () -> IO (), that runs a set of operations with a restricted set of permissions.

I think World is a bad name

I think World is a bad name for this -- something like IOPermissions would be better I think.

Perhaps. But it's not just IOPermissions, it's the visibility of anything from a program. Effectively, World totally defines the world in which the program runs. Perhaps Environment is better? Namespace might be another possibility. All IO, all namespaces, everything visible to the program, that is not defined within the program, is eventually accessed via World. World thus seems pretty appropriate to me. :-)

But I certainly welcome a more granular design, or a more appropriate name.

.Leave eve