Capability-based security; how does it work?

Capabilities, as I understand them, are essentially keys. In the simplest model, a program presents a cap to the OS with a request to "do something", and the OS checks the cap for validity. If it's valid (it has been issued to the requestor and hasn't been revoked or expired), the OS carries out the request.

Now, these things are divisible and delegatable. If a program (like a command shell interpreter) has a cap that represents the authorities that the OS has issued to a particular user, it can create a cap that represents the user's authority over a particular file, directory, or device, and pass that on to another program (like a compiler) when the user invokes that program supplying the file, directory, or device name as an argument (like the file to compile, or the directory in which to create a file and write the compiled code). The idea is that each program should run with only the privileges it actually needs.

Implication: programs no longer parse their own command lines, or at least not the parts of them referring to capabilities. Something to which the user's caps have been delegated has to do that, so as to provide the programs directly with the caps they need.
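For concreteness, here is a minimal Python sketch of that delegation style; the names and the toy "compiler" are hypothetical, purely for illustration:

    # Hypothetical sketch: the shell holds the user's authority and delegates
    # only what the compiler needs -- an open source handle and an open output
    # handle -- instead of passing filenames and letting the compiler roam the
    # filesystem with the user's full authority.

    def compile_source(source_cap, output_cap):
        """The 'compiler': it can only use the handles it was given.
        It never sees a path and never calls open() itself."""
        text = source_cap.read()
        output_cap.write(f"compiled({len(text)} chars)\n")

    def shell_invoke_compiler(source_path, output_path):
        """The 'shell': it parses the command line, exercises the user's
        authority once to open the files, then delegates the narrowed caps."""
        with open(source_path, "r") as src, open(output_path, "w") as out:
            compile_source(src, out)   # delegation: pass capabilities, not names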

They are also replicable. That is, a program can communicate a cap to another program without itself giving up that cap. Thus, you can delegate the ability to run a camera or something to a remote program across the network, without giving up the ability of local programs to use that camera.

In fact, the "ambient authority" of machine code running directly on hardware is what the OS divides and replicates to provide the set of capabilities that represent the user's authority - or a particular program's authority, or etc.

The OS must know who (or what) owns a capability; when something presents a capability that was issued to something else, the request should be refused. Otherwise, the system could be breached by a program that "guessed" the binary pattern of a capability issued to something else. This means that replicating, delegating, or subdividing a capability can only be done via calls to the OS, so it can keep a current picture of who owns what.

Okay, so far so good. That's the theory as I understand it. But the theory doesn't in general match up with the security claims made for the technology, or at least those claims imply the existence of additional infrastructure, so I feel like I must be missing something.

In the first place, presenting a capability to the OS must also give the OS identity information about what's presenting the cap, so that it can check its ownership and make sure that cap actually belongs to the presenter. But every time a program starts, it has a new process ID. If the OS keeps track of the file from which the program code is loaded and assigns the compiled-in caps to the resulting process ID, then it can be defeated by putting different code into the file in order to hijack the caps.

So this seems to imply a requirement I've never seen discussed; that files with compiled-in capabilities have to be restricted such that no user whose own capabilities are not a superset of those compiled into the file can write it. IOW, if a program has the capability to write /etc/passwd then no user who does not have a capability to write /etc/passwd can be allowed to write to the program file.

A user must have all the capabilities required by any software he or she is installing or modifying. In principle each user privileged in a certain domain can install software that has his or her privileges in that domain. But it could easily happen that even where all the capabilities exist, no particular user would have the capabilities to install something.

So in practice, it seems like you'd always need an omniprivileged root account to do some kinds of software installation and maintenance. Therefore it seems strange to me that the omniprivileged root account is claimed not to be needed.

Am I missing something?

Authorization is almost

Authorization is almost never statically stored in the program text or data, and storing it in the file attributes of the program (setUID on Unix) is generally frowned on.

There are lots of ways to control authorization dynamically aside from login+fork. As one familiar example, consider the https sessions in your web browser that you might initiate with different passwords for different websites. That method is based on cryptography, one point of which is making sure that the binary patterns whose guessing would compromise security are too hard to guess as a practical matter. There are also systems where authentication via a local dongle or biometric gives higher credibility than a (possibly remote) password authentication.

The sudo program is another familiar capability method common on Unix systems, essentially providing a way of having the admin control temporary/dynamic effective uid and effective groups.

There is no ownership of

There is no ownership of capabilities, there is only possession. There is also no notion of identity. Given your comments elsewhere, I take it you're familiar with Scheme, so I'll point you to Rees's W7 Scheme which is a capability-secure Scheme subset.

Capabilities are essentially just unforgeable references like a memory-safe reference to a value/object. Thus, the lambda calculus (LC) is capability-secure.

The problem only arises when people attempt to extend the LC with host facilities that are not capability-secure, like the file system, which exposes ambient authority.
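A tiny illustration in Python rather than Scheme, covering both points, namely that a capability is just a memory-safe reference you were handed, and that a global file system opens an ambient-authority hole (all names are made up):

    # Sketch: a capability is just a reference you were handed. Attenuation
    # builds a weaker facet from a stronger one, using nothing but closures.

    def make_file_cap(path):
        """Full-authority capability over one file (read and write)."""
        def read():
            with open(path) as f:
                return f.read()
        def write(data):
            with open(path, "w") as f:
                f.write(data)
        return {"read": read, "write": write}

    def read_only_facet(file_cap):
        """Attenuation: a new capability exposing only the read authority.
        Holders of the facet cannot reach the underlying write()."""
        return {"read": file_cap["read"]}

    # The ambient-authority hole: any code can ignore all of this and call
    # open("/etc/passwd") directly, because the host file system is a global.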

This means that replicating, delegating, or subdividing a capability can only be done via calls to the OS, so it can keep a current picture of who owns what.

The OS does not know "who owns what" any more than a GC knows "who" owns "what" in a running program. This statement is generally meaningless in capability systems, just as it is in language runtimes, because there is no notion of identity, the "who", or ownership (you could add processes for identity and account memory to it, but this is an extension to the capability model).

If you come at capabilities using PLT terminology I think the situation will be much clearer to you. The OS case then involves adding a level of indirection or two to map language runtime services to hardware instructions and/or OS system calls.

Horton's "Who Done It?"

(you could add processes for identity and account memory to it, but this is an extension to the capability model)

You can achieve this as a pattern within the capability model, without extending it. See Delegating Responsibility in Digital Systems: Horton's "Who Done It?"

Hmm, I don't see a

Hmm, I don't see a straightforward way to unify memory accounting with the Horton protocol. I'd have to think about it. I also didn't mean "identity" in the ACL sense, but in the sense of "ownership of X bytes of allocated memory".

In any case, I suspect a capability secure approach to memory management will grow out of regions. Adding static and dynamic quotas to regions looks like a pretty straightforward extension, and that's all you need to prevent memory DoS. You add the "ExceededQuota" effect to the signature for "spawn", where you can register a handler should the exception be raised (in capability OSs, this handler is called a "keeper").
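As a rough, purely illustrative sketch of what such a quota-plus-keeper interface could look like (ExceededQuota, MemoryQuota, spawn, and keeper are all hypothetical names, not from any real system):

    # Illustrative only: a spawned task allocates through a quota object; when
    # the quota would be exceeded, a registered handler (the "keeper") is
    # consulted instead of the task silently growing.

    class ExceededQuota(Exception):
        pass

    class MemoryQuota:
        def __init__(self, limit_bytes, keeper=None):
            self.limit = limit_bytes
            self.used = 0
            self.keeper = keeper          # handler invoked on overrun

        def allocate(self, nbytes):
            if self.used + nbytes > self.limit:
                if self.keeper is not None:
                    return self.keeper(self, nbytes)   # keeper decides what to do
                raise ExceededQuota(nbytes)
            self.used += nbytes
            return bytearray(nbytes)

    def spawn(task, limit_bytes, keeper=None):
        """Hypothetical 'spawn': the task only ever allocates via the quota
        capability it is handed, so it cannot exceed its budget silently."""
        return task(MemoryQuota(limit_bytes, keeper))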

Accountability

fundamentally involves responsibility, which is what Horton describes.

I'm pursuing different approaches to resource conservation. I think Waterken IOU, focusing on trade and resource markets, is a more promising basis. Memory isn't associated with a 'region'. It's associated with a 'purse'.

How are they unforgeable?

If the OS doesn't know who owns what, then how are capabilities made unforgeable? If someone presents a bogus pattern of bits calling it a capability, isn't there a nonzero chance that the bit pattern will be the same as a bit pattern that someone would present to represent a legitimate capability, and therefore be accepted by the OS?

Okay, you could minimize the random chance by adding 64 random bits to each capability key and encrypting. But this is still an issue because in practice I don't think you can keep people from reading the bits, somewhere, in some context, and then presenting them in other contexts where the privileges they represent should not obtain.
I have in the past observed Lisp Machines subverted by code that "guessed" the values of pointers to structures it should have had no access to; if the OS doesn't keep track of who owns the capabilities, how can a similar forgery be prevented?
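For concreteness, a minimal sketch of the "long random bit pattern" approach mentioned above, sometimes called a password or sparse capability (all names are hypothetical):

    # Sketch of a "sparse capability" table: the issuer hands out long random
    # tokens and later checks presented tokens against its table.
    # Unforgeability here is probabilistic -- tokens are too long to guess --
    # rather than based on tracking who owns what.

    import secrets
    import hmac

    class CapTable:
        def __init__(self):
            self._caps = {}                      # token -> resource

        def issue(self, resource):
            token = secrets.token_bytes(32)      # 256 random bits
            self._caps[token] = resource
            return token

        def invoke(self, presented):
            for token, resource in self._caps.items():
                if hmac.compare_digest(token, presented):  # constant-time check
                    return resource
            raise PermissionError("no such capability")

        def revoke(self, token):
            self._caps.pop(token, None)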

Public key cryptography

One technology underlying many capability-based systems is public key cryptography. The genius of pkey is that by using "one way functions" it is possible to never show the real secret part, yet still verify that someone knows the secret part (up to an arbitrarily small chance). The current speed of computing technology for guessing secrets is used to figure out how long the keys need to be. Nowadays, the less secure ones start at about 256 bits.

Not buying the public keys....

Okay, so we can have the OS encrypt a nonce with key A and a program demonstrate knowledge of the corresponding key B by decrypting it. Repeat a couple times, and you are very sure that the program knows key B.
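For concreteness, a minimal sketch of that kind of challenge/response, using an HMAC over the nonce as a symmetric stand-in for the asymmetric decryption described above (names are hypothetical):

    # Symmetric stand-in for the challenge/response described above: the OS
    # sends a fresh nonce; the program proves possession of key B by returning
    # HMAC(key_B, nonce). The key itself never crosses the interface -- but it
    # still lives in user space, which is the objection raised next.

    import hmac, hashlib, secrets

    def os_challenge():
        return secrets.token_bytes(16)              # fresh nonce per request

    def program_respond(key_b, nonce):
        return hmac.new(key_b, nonce, hashlib.sha256).digest()

    def os_verify(key_b, nonce, response):
        expected = hmac.new(key_b, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)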

But that still puts key B in user space, where a user can gain knowledge of it somehow and use that knowledge to forge key B in a different context.

So I'm not buying this argument. The model it presents doesn't match the claims that have been made for it.

Public key doesn't work like that

With public key encryption, the enforcer of a capability doesn't need to use a secret key. The mechanism of a digital signature allows any entity to check that an entity possessing the secret key signed off on a specific permission.

SPKI is an example of a public key based delegation system (not particularly focused on OS security).
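As a hedged sketch of the signature idea (this assumes the third-party Python `cryptography` package is available; the permission string is made up):

    # Sketch: the grantor signs a permission blob; anyone holding the
    # grantor's public key can verify that the holder of the secret key
    # signed off on that specific permission. No secret is needed to verify.

    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
    from cryptography.exceptions import InvalidSignature

    grantor_key = Ed25519PrivateKey.generate()
    grantor_pub = grantor_key.public_key()

    permission = b"may-read:/var/log/syslog;expires:2038-01-19"
    signature = grantor_key.sign(permission)          # grantor signs off

    def check(permission, signature):
        try:
            grantor_pub.verify(signature, permission)  # raises if forged/altered
            return True
        except InvalidSignature:
            return False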

So I'm not buying this argument. The model it presents doesn't match the claims that have been made for it.

Can you restate which specific claim about a capability system is the one you are skeptical of?

Okay, so it's digital

Okay, so it's digital signing rather than public-key encryption. Both involve asymmetric keys, but they're different applications.

If you're checking that the holder of a particular secret key signed off on a blob, and associating a capability with the blob on the basis of that signature, then the blob is what we call a digital certificate, and the holder of the secret is the signing authority.
In this case, power to do something is associated with the signing authority.

Patterns of revocable delegation work if we allow the capability representation to get longer each time a new capability is derived; in this case there are multiple signing authorities, any one of which may render the blob invalid by dropping its secret key, and each signing authority has to handle the request as a proxy on its way back to the first signing authority (who we assume has control of the physical device or resource or whatever).

A program has to know who it got a capability from so it can send its requests to use that capability directly to the source of it. A program has to know what other programs it delegated to so it knows which key to use before sending a request onward or a result back. But the system will work without anybody else knowing.

Regardless of the system being technically possible without having the OS track identity, I would think it somewhat mad to attempt to sell anyone on a system in which the identity of a malefactor, or at least the source of a malefactor's capabilities on the local machine, could not easily be discovered.

Finally, if the OS is unaware of the chain of delegation there is a whole different security issue with the last delegee trusting all the signing authorities along the way to pass on the request verbatim and pass back the result (if any) verbatim. But perhaps that's just how life is for the last delegee.

Identity of the Malefactor

What good does having this do you, really? By the time you care about identity of a 'malefactor', security has already been violated. Thus, you aren't talking about achieving computer security. You're seeking liability on the presumption of computer insecurity.

Mechanisms for liability, auditing, and accountability have their place. Those help tackle security on the 'threat' side, by controlling incentives, rather than on the 'vulnerability' side. But, wherever possible, you should remove security vulnerabilities under the principle of least authority (POLA). Under POLA, you grant fine-grained authorities where all possible 'uses' are legitimate and authorized. Only where POLA cannot be achieved should you resort to blame games and liability, and thus it is not necessary for 'identity' or 'responsibility' to be universal concepts in a cap system.

Anyhow, to the extent patterns for liability are necessary, they must also be secured. Capabilities offer excellent, composable ways to describe responsibility. Review the Horton document for one example. I've also briefly described some related designs on cap-talk, though I apparently didn't say anything controversial enough to provoke a response.

Revoking keys, finding "owners"

Before discussing key revocation, I want to clarify something that applies to many different posts by various people in this thread, including me.

It is sometimes tricky to figure out which comments about capability systems apply in general and which are mainly valid in the context of the security that is being enforced by an OS kernel - perhaps some would even like to reserve the use of "capability system" for OS kernel type stuff, but that's not my understanding. The "Capability Myths Demolished" paper talks about capability systems in general and lists a set of important properties, but some of the claims only make sense in the OS kernel context - e.g. "A. No Designation Without Authority" is cited as an important feature because it simplifies implementation, but if we were talking instead about a hot-swappable, distributed network system I would regard it as a bug, and similar remarks apply to "D. No Ambient Authority". To the best of my knowledge, it is conceptually easy to design an OS kernel capability system around the use of pkey, but that is not what is efficient in practice, and not the sort of capability system I had in mind when discussing pkey.

Revocation is easy within public key based systems, though it essentially parallels the way revocation works with other methods. The original permission grantor retains the capability to "publish and modify" a list of valid public keys for a particular permission. They digitally sign that list with a different key, so various parties can check its authenticity. The signed permissions that are passed around are associated with a particular key on the list. To enable revocation, all that is required is that the permission checker systematically check the updated list to make sure that the public key associated with a particular signing is still valid. They check the signature of the list itself to make sure the list copy they have is valid. If multiple levels of revocation are desired, then multiple levels of lists will be required. That is essentially how CA hierarchies work, except traditionally they are focused on the specific capability of establishing identity credentials. As SPKI points out, there is nothing special to the delegation logic about that particular capability.
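A rough sketch of that "publish and modify a list of valid keys" scheme, with an HMAC standing in for the list signature to keep it short (all names hypothetical):

    # Rough sketch: the grantor publishes an authenticated list of currently
    # valid key ids; a checker verifies the list's tag and then checks that
    # the key id behind a presented permission is still on the list.
    # (A real deployment would sign the list asymmetrically; HMAC is a
    # symmetric stand-in here, so the checker would not normally hold it.)

    import hmac, hashlib

    LIST_SIGNING_KEY = b"grantor-list-signing-key"    # stand-in for a real key pair

    def publish_valid_keys(key_ids):
        blob = b",".join(sorted(key_ids))
        tag = hmac.new(LIST_SIGNING_KEY, blob, hashlib.sha256).digest()
        return blob, tag                               # distributed to checkers

    def still_valid(key_id, published):
        blob, tag = published
        good = hmac.new(LIST_SIGNING_KEY, blob, hashlib.sha256).digest()
        if not hmac.compare_digest(good, tag):         # list itself must be authentic
            return False
        return key_id in blob.split(b",")

    # Revocation = the grantor republishes the list without the revoked key id.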

The issue of detecting and dealing with malefactors seems like altogether a different branch of computer security. A lot of expert-system type work is being done to look at, for instance, identifying temporally suspicious patterns of access requests/failures. So this issue is usually considered orthogonal in computer security. But I think your point is that if capability systems completely replaced UserIDs or other mappings from security credentials to people, then there would be no way to trace back blame. That would be true, but the premise that all mappings from security credentials back to identities are necessarily removed is false. The principle of minimal privilege doesn't require that. It requires not using the same (full) identity credential for all authorizations.

About those distinctions

The purpose of that list isn't to state requirements for capability systems. It is to give a set of distinctions that can be used to characterize different sorts of capability systems.

So indeed, "no designation without authority" isn't essential, though one could make a purist argument that identity comparison is a form of authority. This is a point of philosophy over which many beers have been and will be raised. An example of a system that might use this approach is a password capability system.

But "no ambient authority" is indeed essential. In the presence of ambient authority, so many security properties break down that security is lost altogether. At which point it's no longer very interesting what the security model is.

What you describe as revocation in public key systems really isn't revocation. The old keys are not rendered invalid - it requires an active check on the part of the wielder or the validator to determine that they are invalid. This is very prone to programmer error, and it invites compromise of the checking protocol at any of several levels.

Distinctions

So indeed, "no designation without authority" isn't essential, though one could make a purist argument that identity comparison is a form of authority. This is a point of philosophy over which many beers have been and will be raised. An example of a system that might use this approach is a password capability system.

The "Capability Myths Demolished" paper talks early on about the importance of modeling dynamic changes in permission states. That is important. But the "no designation without authority" requirement is stated in a way that rules out consideration of many interesting and important dynamic properties. In order to model new facts, one generally needs both predicates/relationships and objects that those predicates apply to. But your description of "no designation without authority" seems to require a fixed set of propositions. In fact, you say in another post,

In a properly designed capability system, this is a logical impossibility. Capabilities do not "arise". They exist (logically) from the moment the machine is turned on. The problem, then, is not the existence of a capability, but the fact that some program you do not approve of has come to hold some capability that grants authority you care about.

which is basically a static way of looking at the set of permission states to be modeled. That may be appropriate for some contexts and modeling simplifications, but it was the basis for my remark that I would consider this an important defect for talking about, say, a distributed, hot swappable, system. For instance, how does your concept of capability modeling approach security issues related to use of a USB pen drive?

But "no ambient authority" is indeed essential. In the presence of ambient authority, so many security properties break down that security is lost altogether. At which point it's no longer very interesting what the security model is.

The paper's definitions of "No ambient authority", which depend on understanding the meaning of "select" or "indicate" a credential, also break down in a lot of situations, including high latency client server protocols like http/https. When we leave the single OS/process situation, I'm no longer sure what this property is actually saying. Thus my remark that it is also kernel/OS centric.

What you describe as revocation in public key systems really isn't revocation. The old keys are not rendered invalid - it requires an active check on the part of the wielder or the validator to determine that they are invalid. This is very prone to programmer error, and it invites compromise of the checking protocol at any of several levels.

The old keys are rendered invalid for the relevant permission, which is the point. There is no argument that for an OS kernel, it is more efficient to invalidate a reference in one of the classical ways - is it null, or out of bounds, or for an appropriate memory segment, etc. - but there is still checking, and the situation is logically parallel. For the sake of discussion, if we actually wanted to implement capabilities using pkey, then it's easy to have a kernel API that implements what I described as "publish and modify", so the API could be called as the implementation of that revocation, and the OS can then perform whatever internal data-structure implementation of that it deems appropriate to store that information in kernel space.

Extensibility

In a properly designed capability system, this is a logical impossibility. Capabilities do not "arise". They exist (logically) from the moment the machine is turned on.

...which is basically a static way of looking at the set of permission states to be modeled.

From this argument, it would follow that we could never stop adding features to a programming language.

What you are failing to consider, I think, is layering. There is no claim in the capability literature that capabilities are (or should be) the only layer at which policy will be enforced. Many of the criticisms of capabilities in the literature are predicated on this assumption, which has always been pretty silly.

On the contrary, the assumption going back to the Chicago Magic Number Machine is that capabilities are a substrate on which abstractions and mechanisms for enforcement will be layered. One models new facts and properties by implementing programs that provide them, and then one designs appropriate mediators for those programs.

When you get down to it, this is the essential difference between capabilities and other systems. Capabilities are designed for extensibility. They are better viewed as a set of primitives for authority management than as a static set of permissions. ACLs and most other systems do not share this extensibility property, so your critique is largely correct where those other systems are concerned.

To put this another way: capability semantics is basically that of the lambda calculus with side effects, and it is extensible in the same way. Given this, the questions of interest always seem to boil down to:

  • Can the desired policy actually be expressed? Often the answer is "no", and the reason invariably is that the policy desired is self-contradicting.
  • Given expressibility, is a particular implementation efficient enough?

My private belief is that the first point is one reason for the failure of capabilities. Administrators and policy makers do not like to have the inherent contradictions of their preferred policies laid bare through mathematics. They would often prefer policies that feel good but are not actually enforceable (and in many cases have been proven unenforceable).

The paper's definitions of "No ambient authority", which depend on understanding the meaning of "select" or "indicate" a credential, also break down in a lot of situations, including high latency client server protocols like http/https.

I don't see why you believe so. Can you expand?

What you describe as revocation in public key systems really isn't revocation. The old keys are not rendered invalid...

The old keys are rendered invalid for the relevant permission, which is the point.

No they aren't, which is the point. The keys themselves remain just as usable as they were before. The revocation relies on one side of the transaction to do a third-party check. It is easy to see that the key is not invalid: what happens when the revocation cache cannot be reached? The mere fact that this question can arise demonstrates that the key is in fact still valid.

Back to basics

In the earlier post, that you responded to, I indicated that people were writing with different concepts of capability based security in mind, and that I was not using a purely OS-based concept. What I meant by "capability based security" fits the Wikipedia definition: "A capability (known in some systems as a key) is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights...Capability-based security refers to the principle of designing user programs such that they directly share capabilities with each other according to the principle of least privilege".

In that context, I criticized the *applicability* of the principles of "no designation without authority" and "no ambient authority" to non-OS types of capability based security analysis and to modeling distributed OS and hot swappable hardware. Your follow-ups suggest that you don't want to categorize those types of security analysis as existing at the same level as capability based security - this is exactly the terminology discord I was attempting to illustrate and clarify.

Relative to what I understand as capability based security, it certainly makes sense to talk about the capability to access a particular file, or even a set of files on a USB pen drive, and it also makes sense to talk about the capability to access some set of web pages in a current browser session. If you would prefer to call those permissions something else - say a "Permission Unit" - my preference would be to go along with that and avoid spending any energy on a terminological dispute. I would then rephrase my earlier remark to say that "no designation without authority" is a bad property of a general system for describing "permission units" and "no ambient authority" doesn't map cleanly to "permission units" for web-based access.

Moving on to disagreements that may (or may not) be about something other than terminology...

In a properly designed capability system, this is a logical impossibility. Capabilities do not "arise". They exist (logically) from the moment the machine is turned on.

...which is basically a static way of looking at the set of permission states to be modeled.

From this argument, it would follow that we could never stop adding features to a programming language.

What you are failing to consider, I think, is layering. There is no claim in the capability literature that capabilities are (or should be) the only layer at which policy will be enforced. Many of the criticisms of capabilities in the literature are predicated on this assumption, which has always been pretty silly...
capability semantics is basically that of the lambda calculus with side effects, and it is extensible in the same way.

The lambda calculus is very general, while I understood your "no designation without authority" requirement for capabilities to say that they can be denoted by a static set of propositions without the ability to form new propositions from combinations of dynamically introduced predicates and objects. I don't really know what you are getting at above or how it is supposed to be a response to my point. I noted that it sounded like capability systems with the "no designation without authority" requirement would not be able to model new permission states like read access to a set of files on a recently plugged-in USB pen drive. Perhaps you are talking about "layering" because you think I meant that the software running on a capability-based OS couldn't somehow be programmed to control access to such files? That is certainly not what I meant or what I said; I was explicitly talking about "modeling" and explicitly not restricting attention to OS/kernel primitives.

Also, what is the security analog of "adding features to a programming language"? Surely, you are not saying that dynamically adding a USB device with filesystem to your security object mapping space is analogous to adding a new feature to a programming language, but I'm not getting what you do mean.

The paper's definitions of "No ambient authority", which depend on understanding the meaning of "select" or "indicate" a credential, also break down in a lot of situations, including high latency client server protocols like http/https.

I don't see why you believe so. Can you expand?

It's irrelevant if you think that security for such situations isn't to be modeled as capability based security. But what I meant is illustrated by the following example.

You gain access to a set of web pages for a "session" of limited duration to check on the status of your credit card. As part of that process, your interactions with a server through your browser involve a bunch of different protocols including http, https/SSL/TLS, a login, and some cookies. As a result, your browser can access those web pages transparently for that session. The browser may or may not store cookies in a cache, which it may or may not clear when it closes. The browser may or may not be able to log you in automatically. The server may or may not examine various aspects of the client's profile, including its MAC address, browser version, etc. The server may or may not redirect the client to different web pages than the one it requests. If the server doesn't recognize the client's profile, it may require that you receive a nonce through your registered e-mail address and input that back in an appropriate field.

My point about the above is that the lines between what is or is not ambient authority at different stages of the particular access protocol become too blurry. In general, some ambient authority may be necessary but not sufficient for some stages of the process, whereas other ambient authority may be sufficient but not necessary for other stages.

What you describe as revocation in public key systems really isn't revocation. The old keys are not rendered invalid...

The old keys are rendered invalid for the relevant permission, which is the point.

No they aren't, which is the point. The keys themselves remain just as usable as they were before. The revocation relies on one side of the transaction to do a third-party check. It is easy to see that the key is not invalid: what happens when the revocation cache cannot be reached? The mere fact that this question can arise demonstrates that the key is in fact still valid.

Checking for revocation was described as part of the security checking process, and therefore, of course permission is denied if revocation cannot be checked. If checking for revocation sometimes fails, then that is a quality of service issue, not a security issue.

What you really want to require, I think, is that the process of revocation or changing certain sets of permissions must be atomic on the relevant time scale of interest. But what counts as atomic is relative to level - atomic is different for microcode, threads, processes, databases, etc. You elided the part of my prior post where I described an API to make pkey-based revocation atomic at the OS process level if that is required in a particular setting.

With due respect to

With due respect to Wikipedia, Mark Miller and I are probably more authoritative on term definitions in this area than they are. In fact, I can't identify a single person from the capability community in the edit history for that page.

In any case, as we assembled the "Structure of Authority" paper, it became clear that several kinds of systems with very distinct properties had been labeled "capability systems", with the result that the general understanding of them had become terribly confused. In order to get the discussion back onto a useful footing, a finer-grained taxonomy became necessary. So if you're basing your critique on the Wikipedia definition, you should set it aside and start over. Your critique may still be valid, but applying it to a straw man won't help us resolve your questions.

Relative to what I understand as capability based security, it certainly makes sense to talk about the capability to access a particular file, or even a set of files on a USB pen drive, and it also makes sense to talk about the capability to access some set of web pages in a current browser session.

Not so. A capability designates an object (emphasis on the singular). So a capability can designate the network session to an HTTP server, but it does not thereby designate the set of web pages on that server. This is getting into the distinction between authority and permission. Authority is, in effect, the transitive closure of some initial [set of] permission[s]. This is true in the same way that a PL object reference designates a particular object (permission), but may thereby grant access to a reachable object graph (authority).

When people speak of "a capability to a set of files", they are not using "capability" as a technical term. A more precise rendering would be "a capability conveying authority to a set of files". The difference being that the precise rendering acknowledges that transitively interacting behavior is involved. Of course, this level of precision is cumbersome, and people quickly revert to the less formal locution. We should just be careful to remember what it means, and to be clear among ourselves about when we are using the term "capability" informally vs. formally.

Given this clarification of terms, I think I would concede that "no designation without authority" might better (and less confusingly) have been expressed as "no designation without permission", but since authority is derived transitively from permission in capability systems, the distinction may not be important. In regard to "no ambient authority", the "ambient" refers specifically to an absence of explicit designation, and the absence of protected designation makes it inherently hard to speak about the distinction between permission and authority, so it's not clear that there is a better term for that.

Analogy: in an ambient authority safe language, there would be objects you could access without a reference. Such a language might be memory and type safe, but it is incapable of enforcing certain security properties.

I understood your "no designation without authority" requirement for capabilities to say that they can be denoted by a static set of propositions without the ability to form new propositions from combinations of dynamically introduced predicates and objects

While there have been some examples of non-extensible capability systems, I think most of them should be viewed as incomplete, early experiments. The mechanism for extension in most capability systems lies in what (in OS capability systems) is most often called the "entry capability". This is a capability that names the application running in some process (more precisely, the "open wait" continuation of that application) as opposed to the process itself. The behavior when an entry capability is invoked is entirely determined by the application, and this allows arbitrary extension in exactly the way that a procedure can build arbitrary computation in a programming language.

In PL-based capability systems, there are capability types analogous to entry capabilities, but I don't know that there is a commonly agreed term for these. It is probably convenient for the moment to refer to them as "entry capabilities", since their role and functionality is so qualitatively similar.
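As a purely illustrative PL-level sketch of an "entry capability" in Python (the class and request names are made up):

    # Toy analogue of an "entry capability": callers hold a reference that
    # names the application's request-handling behavior, not the process or
    # its memory. What an invocation means is entirely up to the application.

    class DirectoryService:
        def __init__(self):
            self._entries = {"motd": "hello"}

        def _handle(self, request, *args):
            # Application-defined behavior; this is where new "facts" and
            # policies are modeled, layered above the raw capability substrate.
            if request == "lookup":
                return self._entries.get(args[0])
            if request == "bind":
                self._entries[args[0]] = args[1]
                return None
            raise ValueError("unknown request")

        def entry_capability(self):
            return self._handle          # the only thing clients ever receive

    svc_entry = DirectoryService().entry_capability()
    svc_entry("bind", "usb0", "capability to the newly plugged-in drive")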

What is the security analog of "adding features to a programming language"? Surely, you are not saying that dynamically adding a USB device with filesystem to your security object mapping space is analogous to adding a new feature to a programming language, but I'm not getting what you do mean.

The question as framed suggests that we may be talking past each other. Capability semantics constitutes a core computational algebra for expressing permissions, not security. The purpose of the core algebra is to allow security policies to be implemented by conventional code. As a permissions algebra, the capability algebra appears to be expressively sufficient. This is true in the same way that most programming languages are Turing-complete.

The problem (I think) with the question you are framing is that you persist in a confusion of layering. It isn't the purpose of capabilities to be extended to handle USB drives. It is the purpose of capabilities (a) to act as a substrate on which the policy and abstraction of a USB drive can be handled by higher-level code, and (b) to provide a formal operational semantics of permissions and foundational access control that can be used to determine whether the policy implemented is correct. Not all capability systems support (b). In particular, most encryption-based capability systems do not, because there is no separation between data and permission.

In consequence, my initial response is that no extension of capabilities is required to add the notion of a USB drive. What may be required (depending on what is already present) is extension of the security-enforcing code whose policy is enforced by means of wielding (on the one hand) and restricting the propagation of (on the other) capabilities.

You gain access to a set of web pages for a "session" of limited duration to check on the status of your credit card... My point about the above is that the lines between what is or is not ambient authority at different stages of the particular access protocol become blurry.

This statement is too fuzzy to help us come to mutual understanding. Can you please try to reframe this statement in terms of the distinction between permission and authority that I outlined above?

I think that in fact you don't gain access (permissions) to web pages. What you gain access to is a communication session. The authority that is gained thereby (i.e. to the web pages) is a consequence of the dynamic composition of capability-based permissions by the web server.

There is no ambient authority, and no blurriness in this picture. Information such as the MAC address and the browser version are part of the explicit message payload, and as such, are part of the request made by the client on which the server is to operate. The fact that the server acts on the basis of the content of its request is not a consequence of ambient authority. The fact that the web server consults a database of blacklisted IP addresses is also not a consequence of ambient authority if the web server holds a capability to the database (other, non-capability designs are of course possible, and those might or might not rely on ambient authority).

What you really want to require, I think, is that the process of revocation or changing certain sets of permissions must be atomic on the relevant time scale of interest.

Not quite. What I want is to be clear about the distinction between revocation of permission -- which should indeed be atomic in the sense that you describe, because the notion of "permission" in the capability model is atomic -- and refusal of service implemented at a higher level as a consequence of a security protocol. Refusing service based on a CRL is a fine thing to have, but it shouldn't be confused with revocation. Partly because this confuses layering, and partly because the two behaviors have qualitatively different vulnerabilities.

A concrete example of the distinction: if the capability is really revoked, the target server cannot even be accessed, because in the absence of permission to do so there is no way to send it a message at all. If the service must check for revocation, then it has received a message, and a certain tax on server-side resources has been imposed.

While the effect from the client perspective in either case is a denial of request, the semantics of the two scenarios from the server perspective are quite different. Because of this, analysis of the respective security properties of the two schemes must proceed from different starting conditions and axioms. And if you really want to know whether (and what) security is enforced, that matters.
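A small sketch of that distinction, using the classic caretaker pattern (illustrative names): after revoke() the forwarder holds no reference to the target, so no message can reach it at all, whereas a CRL-style check means the request was already received and examined.

    # Caretaker pattern sketch: revocation nulls out the only reference the
    # client can reach. After revoke(), the target never receives a message --
    # there is nothing left to deliver it through.

    def make_caretaker(target):
        state = {"target": target}

        def facet(*args, **kwargs):
            t = state["target"]
            if t is None:
                raise PermissionError("capability revoked")   # nothing is sent
            return t(*args, **kwargs)

        def revoke():
            state["target"] = None       # sever the reference itself

        return facet, revoke

    # Usage: hand out `facet`; keep `revoke` for yourself.
    facet, revoke = make_caretaker(print)
    facet("still authorized")
    revoke()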

OS level vs others levels of analysis

We still seem to be talking past each other in a lot of places.

In any case, as we assembled the "Structure of Authority" paper, it became clear that several kinds of systems with very distinct properties had been labeled "capability systems", with the result that the general understanding of them had become terribly confused. In order to get the discussion back onto a useful footing, a finer-grained taxonomy became necessary. So if you're basing your critique on the Wikipedia definition, you should set it aside and start over. Your critique may still be valid, but applying it to a straw man won't help us resolve your questions.

I'm 99% sure that I haven't critiqued any straw men, and 100% sure I didn't borrow any from Wikipedia. There are a lot of mis-readings of what I wrote in your responses, so I'm not sure which points you consider to have been part of a straw man argument. That Wikipedia page indicates that the authors, whoever they are, read a bunch of your articles including the "Capability Myths" paper, so it's a shame that you think they got it all wrong...

This sub-thread began when I pointed out that different people were talking about different things when they said "capability based" security. The two points of referencing the Wikipedia page are, first, that it contained a definition that fit the way I was using the terms -

"A capability is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights. Capability-based security refers to the principle of designing user programs such that they directly share capabilities with each other according to the principle of least privilege, and to the operating system infrastructure necessary to make such transactions efficient and secure."

- and second, it illustrated that this usage has at least some minimal currency. I already indicated that I am quite happy to bow to your authority in these terminological matters by using different terminology for the meaning given above. I will henceforth call it a "permission unit" and talk about "permission unit based security".

One property of what I mean by a permission unit is that there is not necessarily any mapping between a given permission unit and a protected data structure in the kernel space of an OS. My particular "critique", if you want to call it that, can be restated as saying that "no designation without authority" is not a generically desirable property of security models where the objects of analysis are *permission units*, and it is too confusing, in general, to say whether a particular permission unit has been granted based on "ambient authority".

Relative to what I understand as capability based security, it certainly makes sense to talk about the capability to access a particular file, or even a set of files on a USB pen drive, and it also makes sense to talk about the capability to access some set of web pages in a current browser session. If you would prefer to call those permissions something else - say a "Permission Unit" - my preference would be to go along with that and avoid spending any energy on a terminological dispute.

Not so. A capability designates an object (emphasis on the singular).

I assume you're not actually trying to comment on the coherence of my mental state (i.e. does what I understand make sense)... and also that we can talk about something substantive beyond terminological turf wars (which I believe I did adequately side-step in the fully quoted text above).

Following up on those assumptions, I criticize the generality of a meta-theory of authorization/permission models that places strong restrictions on what the models can call objects. A file is not an indivisible unit, and people are sometimes interested in placing different permission restrictions on different parts of a file. One way to implement that is by using different encryption regimes for the different parts (alternative filesystem implementations are also possible). If an analysis approach that is supposed to classify/categorize security regimes insists that the elementary units of what can be secured map one-to-one to kernel data structures, then it imposes unnecessary limitations on what it can easily analyze. That's clearly not a criticism of the theory as a classification of kernel-level approaches to security; it's a criticism that it doesn't work well for describing models at other levels. Talking about "extensibility" or "layering" doesn't adequately address this point. Consider the analogy to hardware and software layers: material properties depend on quantum mechanics, transistor design depends on material properties, logical properties of circuits depend on electrical properties... software depends on hardware. But quantum mechanics and electrical properties aren't useful for discussing most aspects of software, even though working software is in some important sense layered on those properties.

So a capability can designate the network session to an HTTP server, but it does not thereby designate the set of web pages on that server.

I was also talking about sessions, but the kernel-centric object notion of singularity doesn't seem useful here either. The authority that is the most semantically meaningful is an emergent property of sub-states of several different processes including the browser, the client OS, the server OS, a web server, probably one or more firewall OSes, and maybe a few other processes. Several different permission units are involved, they don't reside only in kernel space, and they aren't passed only through kernel space. Trying to claim that all these permission units map to singular objects is hopelessly contrived for modeling at the higher level. One-time passwords (OTP) are a very common component of solutions trying to implement the principle of least privilege, but they can rarely be understood as mapping to singular objects in the sense you want to require. So an OTP is a permission unit, as I define it, but it is not a capability. A security modeling language that can use permission units as primitives is going to have a much easier time describing web security setups that use OTP than one requiring capabilities as primitives. Please note that saying that is not in any way the same thing as saying that other required properties of capabilities don't matter. Where they matter, these additional properties should obviously be described as well and taken into account in any modeling and proof.

What is the security analog of "adding features to a programming language"? Surely, you are not saying that dynamically adding a USB device with filesystem to your security object mapping space is analogous to adding a new feature to a programming language, but I'm not getting what you do mean.

The question as framed suggests that we may be talking past each other. Capability semantics constitutes a core computational algebra for expressing permissions, not security. The purpose of the core algebra is to allow security policies to be implemented by conventional code. As a permissions algebra, the capability algebra appears to be expressively sufficient. This is true in the same way that most programming languages are Turing-complete.

A few points/questions regarding the above:

1. The distinction between defining a function in a given programming language and extending the language by adding a new feature seemed to get lost in there somewhere, so I'm going to assume that you would rather rephrase your earlier remark to be about defining new functions using existing ones.

2. Where is this particular core capability algebra that you speak of defined? Is there more than one precise definition that is provably mathematically equivalent ala Turing completeness?

3. Are you claiming that the expressive power of software running on a capability based OS is theoretically different from software running on current mainstream operating systems? Do programs running on Coyotos have observable security related behaviors that are not realizable by software running on other modern operating systems running on the same hardware? If so, what are they? If not, then isn't it fair to say that the distinctions you are trying to capture in the "Capability Myths" paper must be about something other than the ultimate expressive power of software that can be implemented on the given systems?

4. I already remarked that I wasn't claiming anything about the theoretical expressive power of software running on a capOS, and that my concern is with how to describe security models.

5. Turing complete (or not) is a useful way to categorize languages and the infinite sets of programs they can create, but specifying a Turing machine is usually not a good way to describe a particular program.

I understood your "no designation without authority" requirement for capabilities to say that they can be denoted by a static set of propositions without the ability to form new propositions from combinations of dynamically introduced predicates and objects

While there have been some examples of non-extensible capability systems, I think most of them should be viewed as incomplete, early experiments. The mechanism for extension in most capability systems lies in what (in OS capability systems) is most often called the "entry capability". This is a capability that names the application running in some process (more precisely, the "open wait" continuation of that application) as opposed to the process itself. The behavior when an entry capability is invoked is entirely determined by the application, and this allows arbitrary extension in exactly the way that a procedure can build arbitrary computation in a programming language.

So the application implements security mechanisms, and detailed understanding/modeling of what the application is doing has to happen at some other level of analysis. This corresponds to the point I have been making. Some of the distinctions in the "Capability Myths" paper don't map well to that other level of analysis. You're saying, I think, that you don't intend capabilities to do the job of talking about that other level, and I'm saying a different setup than the one in the "Myths" paper is needed for talking about other levels. At that point there isn't much of anything left for us to debate.

You gain access to a set of web pages for a "session" of limited duration to check on the status of your credit card... My point about the above is that the lines between what is or is not ambient authority at different stages of the particular access protocol become blurry.

This statement is too fuzzy to help us come to mutual understanding. Can you please try to reframe this statement in terms of the distinction between permission and authority that I outlined above?

I'm happy with your usage of "authority" as the result of dynamic composition. It works equally well for capabilities and permission units.

I think that in fact you don't gain access (permissions) to web pages. What you gain access to is a communication session. The authority that is gained thereby (i.e. to the web pages) is a consequence of the dynamic composition of capability-based permissions by the web server.

You are using "permissions" differently than I am, but otherwise so far so good.

There is no ambient authority, and no blurriness in this picture. Information such as the MAC address and the browser version are part of the explicit message payload, and as such, are part of the request made by the client on which the server is to operate. The fact that the server acts on the basis of the content of its request is not a consequence of ambient authority. The fact that the web server consults a database of blacklisted IP addresses is also not a consequence of ambient authority if the web server holds a capability to the database (other, non-capability designs are of course possible, and those might or might not rely on ambient authority).

Let's review the definition and examples of ambient authority that were given in the paper. The definition was:

We will use the term ambient authority to describe authority that is exercised, but not selected by its user.

The examples cited in the paper of systems that do not have the "no ambient authority" property, according to Figure 14, are "Model 1. ACLs as columns", "Unix fs setfacl(), NT ACLs", "Model 2. capabilities as rows.", and "POSIX capabilities". Various remarks elsewhere in the paper apparently indicate that "user" corresponds to "user accounts" and emphasize that the limitations of ACLs are related to the lack of granularity in the user account mechanism - e.g. principals could have multiple user accounts that they use for different purposes, but usually they don't.

So you are saying that the user account which a principal logs into acts as an ambient authority for permissions requested by processes created by that user, but, for instance, the MAC address of the user's machine, which is typically completely static during the time the principal or her employer owns a machine, is not an example of ambient authority even when it is required by a security protocol, because the ethernet protocol, which is built into the system software, always "selects" this credential to present?? Excuse me, but that's ridiculous. The principal does a lot more selection of a particular user account than they do of their MAC or IP address, their e-mail address (which I used in the credit card example) has similar granularity to their user account, and the collection of cookies spewed in a given POST request by their web browser may or may not be seen as "selected" for the particular permission being sought. I stand by my point that the distinctions of both what is and is not an ambient authority, and whether an ambient authority is required for a given permission, become too muddy to be useful when we move away from user accounts on an OS to a complex web security situation - at least at the protocol level of description. But seemingly, you don't actually intend capability based security to describe things at the level of a complicated network protocol.

What you really want to require, I think, is that the process of revocation or changing certain sets of permissions must be atomic on the relevant time scale of interest.

Not quite. What I want is to be clear about the distinction between revocation of permission -- which should indeed be atomic in the sense that you describe, because the notion of "permission" in the capability model is atomic -- and refusal of service implemented at a higher level as a consequence of a security protocol. Refusing service based on a CRL is a fine thing to have, but it shouldn't be confused with revocation. Partly because this confuses layering, and partly because the two behaviors have qualitatively different vulnerabilities.

A concrete example of the distinction: if the capability is really revoked, the target server cannot even be accessed, because in the absence of permission to do so there is no way to send it a message at all. If the service must check for revocation, then it has received a message, and a certain tax on server-side resources has been imposed.

In your example there are two machines: M1, the client, and M2, the server. On M1, there is a permission unit that involves the ability to send a message to M2. On M2, there is a permission unit to accept messages from M1 at the IP level, and at least another permission unit that depends on analyzing the credentials in a message. Conceptually it is easy to see that any of those three different permissions could be revoked, and revoking any one of them would cause a client request from M1 to fail. For some concerns it will indeed be relevant to be explicit about which subset was revoked. For other concerns, both M1 and M2 might be part of the implementation of a single logical database and the distinction is a hidden and irrelevant implementation detail... We keep coming around to the same point: you want to insist that "capability based" talk not abstract away from the single OS/kernel boundary, and I say "Fine, but security analysis involving permission units is often more appropriate at other levels of abstraction and implementation".

Where is this particular

Where is this particular core capability algebra that you speak of defined?

shap is most likely referring to his diminish-take model. Alternately, as he stated elsewhere in this thread, the most basic pervasive capability system is the lambda calculus with mutation.

Are you claiming that the expressive power of software running on a capability based OS is theoretically different from software running on current mainstream operating systems?

The expressive power of the security properties, yes. For instance, you cannot express and enforce confinement in mainstream operating systems. To express confinement you would have to implement an interpreter/virtual machine and run all software in it.

Further, mainstream OSes allow you to express properties that cannot in principle be enforced, where capabilities do not. Capabilities seem much more aligned with enforceability than other access control models, at least those in current OSes.

I stand by my point that the distinctions of what is and is not an ambient authority, and of whether an ambient authority is required for a given permission, become too muddy to be useful when we move away from user accounts on an OS to a complex web security situation

It's not muddy at all. Taking the lambda calculus model of capabilities, ambient authority is more or less defined as: any top-level bindings via which one can induce side-effects (ie. mutation, I/O, etc.). In other words, if a program induces side-effects via its environment instead of via its given parameters, then it is exercising ambient authority, or as the paper put it, "authority that is exercised, but not selected by its user."

It should be straightforward to map any given situation to this model and conclude whether an authority is ambient.
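
As a rough OCaml sketch of that mapping (the function names and file path here are invented for illustration, not taken from any particular system), compare a procedure that reaches its effect through a top-level binding with one that must be handed the channel it writes to:

  (* Exercises ambient authority: the side effect is reached through the
     top-level binding open_out, i.e. through the environment, not through
     anything the caller selected. *)
  let log_ambient msg =
    let oc = open_out "/tmp/app.log" in
    output_string oc msg;
    close_out oc

  (* Exercises only granted authority: the out_channel, i.e. the capability,
     must be passed in explicitly by whoever invokes the function. *)
  let log_granted oc msg =
    output_string oc msg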

Very well put

Taking the lambda calculus model of capabilities, ambient authority is more or less defined as: any top-level bindings via which one can induce side-effects (ie. mutation, I/O, etc.).

This is exactly correct. To state it another way: evaluation of [terminating] terms does not entail the wielding of permission. Such expressions are effectively constant. It is when terms cease to be term-substitutable that issues of identity and permission must be considered. The [transitive] introduction of state and/or the possibility of non-termination both cause this to happen.

misc

Are you claiming that the expressive power of software running on a capability based OS is theoretically different from software running on current mainstream operating systems?

The expressive power of the security properties, yes. For instance, you cannot express and enforce confinement in mainstream operating systems. To express confinement you would have to implement an interpreter/virtual machine and run all software in it.

In the context where the question about expressive power was posed, I believe shap was implicitly using a definition of 'equivalent expressive power' in the opposite way than you do above. He was saying that a given capOS *did* have the expressive power to enforce previously unanticipated authority regimes - e.g. on the USB pen drive - because it could host a software program that implemented the desired policies.

But thanks very much for the illustrative example. I have a couple of followup questions about it:

1) You only mentioned sandbox/jail based on virtual machines, but it would be technically possible to implement the same policies with alternative libc based sandbox/jail techniques on a modern OS with memory protection (i.e. a malicious application can't rewrite or intercept the libc routines), would it not?

2) desiring total confinement for a process is an unusual case - what do you think are the best examples in practice of particular confinement properties that would be generally desired but are not implemented on conventional OSes because it is too inconvenient to do so?

Further, mainstream OSes allow you to express properties that cannot in principle be enforced, where capabilities do not.

Can you give a canonical example of what you have in mind?

I stand by my point that the distinctions of what is and is not an ambient authority, and of whether an ambient authority is required for a given permission, become too muddy to be useful when we move away from user accounts on an OS to a complex web security situation

It's not muddy at all. Taking the lambda calculus model of capabilities, ambient authority is more or less defined as: any top-level bindings via which one can induce side-effects (ie. mutation, I/O, etc.). In other words, if a program induces side-effects via its environment instead of via its given parameters, then it is exercising ambient authority, or as the paper put it, "authority that is exercised, but not selected by its user."

It should be straightforward to map any given situation to this model and conclude whether an authority is ambient.

I already understood "ambient authority" in the way you describe above, and I've already expressed my disagreement with your last statement above. My argument is precisely that it is not straightforward to map some security analysis situations to that model. The example I gave involved a complicated web access protocol.

You only mentioned

You only mentioned sandbox/jail based on virtual machines, but it would be technically possible to implement the same policies with alternative libc based sandbox/jail techniques on a modern OS with memory protection (i.e. a malicious application can't rewrite or intercept the libc routines), would it not?

See Plash for what can be done. See also the other post for the sorts of restrictions that are required, and also Mark Miller's caveat about equivalency.

desiring total confinement for a process is an unusual case - what do you think are the best examples in practice of particular confinement properties that would be generally desired but are not implemented on conventional OSes because it is too inconvenient to do so?

I'm not sure that it is so unusual a case. Confinement is pervasive in capOSes. In fact it's the default. It's also completely absent in conventional OSes, and can easily be blamed for the pervasive virus problems.

POLA absolutely requires any newly constructed objects to start with no authority, ie. confined, and thus unable to access or interfere with other objects until explicitly granted authority. POLA itself is unachievable in other operating systems as a result. Plash and HP's Polaris try as far as possible to implement POLA, and they do a decent job for a limited domain, but they are fundamentally limited by the poor access control primitives of the OS. Is that the sort of example you're looking for?
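
To make that concrete, here is a toy OCaml sketch (not how any real capOS spawns processes; the types are invented) of confinement by default: a child receives exactly the capabilities granted at creation and nothing else.

  (* A capability here is just a tagged designator, for illustration only. *)
  type cap = Read of string | Write of string

  (* The child's only route to authority is the grants list it is given. *)
  let spawn ~(grants : cap list) (body : cap list -> unit) =
    body grants

  let () =
    (* Fully confined child: no authority at all until explicitly granted. *)
    spawn ~grants:[] (fun caps -> assert (caps = []));
    (* This child may read the shared folder and can do nothing else. *)
    spawn ~grants:[ Read "/home/charlie/friends/shared" ] (fun caps -> ignore caps)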

Can you give a canonical example of what you have in mind?

Communicating Conspirators.

I already understood "ambient authority" in the way you describe above, and I've already expressed my disagreement with your last statement above. My argument is precisely that it is not straightforward to map some security analysis situations to that model. The example I gave involved a complicated web access protocol.

Reviewing the discussion so far, I agree with shap that a capability always names a single object, though that object may be a directory/collection of other capabilities. I didn't see a specific web example discussed though. Did I miss it? Perhaps the Waterken web server can answer some of your questions re: capabilities for the web.

As for the user account/MAC address, the MAC address can be forged and MAC addresses are sent in the clear, so I'm not sure it's really equivalent. Any security model dependent on MAC is already trivially broken. But I kind of got lost in the discussion here, so I'm not sure what sorts of security properties were being discussed.

Various responses

I'm 99% sure that I haven't critiqued any straw men, and 100% sure I didn't borrow any from Wikipedia.

Perhaps my wording was unfortunate. My concern is that you seemed to be (indeed you stated that you were) responding to the Wikipedia definition. The Wikipedia definition is correct in identifying the traditional defining properties of capability systems: (1) fusing designation with permission/authority, and (2) ensuring that the resulting designators are unforgeable. Unfortunately, there are several ways to achieve the second property, and it matters crucially which one is used. The three I know about are:

  1. Partitioning - which may be accomplished through static typing (e.g. in PL systems) or through OS-protected data structures.
  2. Encryption/Sparsity - which has several variants, but the key point is that capabilities and data are not partitioned, so I include sparsity-based approaches in this category.
  3. Password capabilities - which combine some properties of each.

The reason this distinction matters lies in the capability/data distinction. In a non-partitioned approach, transmission of data generally cannot be distinguished from transmission of capabilities, because they are both "just bits". Since a great many security properties (indeed, every one I can think of) rely utterly on being able to distinguish between data transmission and authority transmission, encryption-based capability systems are not particularly useful in practice. The issue isn't guessability. It's the fact that capability transmission (either overt or covert) cannot be mediated in these systems, while in partition-based systems it can.

Password capability systems present a fairly odd middle case. They tend to promiscuously disclose object identity (because "get capability as bits" is merely the identity function), but the requirements that the TCB must validate and decrypt capabilities before use, that this validate-and-decrypt function can only be performed within the TCB, and that the v&d key is traditionally unique per process, can be used to turn a password capability system into a partitioned capability system for purposes of mediation.
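
For illustration only, the partitioned case can be mimicked with module abstraction in OCaml: the capability type is opaque, so client code cannot manufacture a capability from bits or ship its bits as "just data", whereas a sparsity/encryption scheme would make the capability itself a string and lose that distinction. (The names are invented; in a real system the minting operation would be closely held by the TCB rather than exported.)

  module Cap : sig
    type t                          (* abstract: clients cannot forge one *)
    val mint   : string -> t        (* stand-in for the TCB-only operation *)
    val invoke : t -> string -> unit
  end = struct
    type t = { target : string }
    let mint target = { target }
    let invoke c msg = Printf.printf "delivering %S to %s\n" msg c.target
  end

  (* A client can pass a Cap.t around in messages that carry capabilities,
     but it cannot turn received data into a Cap.t, so data transmission and
     capability transmission remain distinguishable and mediable. *)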

Now the reason I am going into this level of detail is that the consequences of these differing design possibilities have such far-reaching implications for what a given system can or cannot do that trying to speak about capability system security "in general" is largely pointless. The formal semantics of the two systems are foundationally different - non-partitioned capability systems have only the read and write operations, and not the take and grant operations. The absence of a distinction between these operations is what led to Earl Boebert's famous, widely cited, and mistaken assertion that capability systems cannot implement confinement (and, in consequence, cannot implement isolation). Earl, by the way, agrees that his conclusion was insufficiently qualified. We both agree that his conclusion is correct for all non-partitioned capability systems.

Since the non-partitioned schemes (note I am not putting password capability systems in this category) cannot enforce any security policies of interest, I am largely uninterested in them. The systems of interest are those that satisfy the Take/Grant model originally proposed by Snyder (I am exceptionally embarrassed that we somehow failed to cite him in our verification work - the SW model that Sam Weber and I put together has some practically important enhancements, but it's not any sort of radical extension). I confess that my concern for mediating authority flow, and my consequently complete disregard for encryption- and sparsity-based capability schemes, sometimes leads me to be insufficiently specific when I speak of capability systems.

Also, just to be clear: my objection here isn't to encryption in the transport between cooperating TCBs. It is to the absence of partitioning as perceived by the application wielding the capability.

I'm less sanguine about Wikipedia's definition of capability-based security, which, to my mind, extends well past least-privilege design these days. It started there, perhaps, but a very powerful set of patterns and idioms has now arisen such that the notion of capability-based security is a bit broader than it once was. Wikipedia perhaps suffers from the curse of all encyclopedias: they can never be up to date on a rapidly moving field.

On to terminology...

I'm happy with (indeed I prefer) your use of the term "capability". It has the advantage that it has a defined and agreed meaning. My problem is that I am largely unable to respond to your propositions and views about capability security without knowing, first, whether you are discussing the partitioned, non-partitioned, or password capability design space. In some cases, I will need to know more specifically whether you mean to refer to object-capability systems (all of which are partitioned) or something else. I think that a lot of our mutual confusion has been a failure to identify a common operational model in which to speak.

One property of what I mean by a permission unit is that there is not necessarily any mapping between a given permission unit and a protected data structure in the kernel space of an OS...

For purposes of this statement, it doesn't matter if we are speaking about an OS kernel space or an object whose protection is maintained by a suitably safe language runtime.

Also, I believe we should set aside the distraction that a single conceptual object may be concretely implemented by multiple protected data structures. In such a case there is always one data structure that "is" the object, in the sense that references to that data structure, and consequently the identity of that data structure, are the property that distinguishes one conceptual object from another. For example, the structure holding the register set of a process may be different from the primary structure that captures the process abstraction, and may be allocated in a different allocation call, but it is not independently designatable and therefore lacks identity or object-ness at the capability layer of abstraction. Your file sub-structure case is different, and I'll get to that in a moment.

So with those distractions set aside, my remaining response is that I have never observed any security model in which separation of designation and permission did not prove to be utterly fatal to all policy enforcement. Indeed, it is an implication of the modelling and formal verification work done by Harrison, Ruzzo, and Ullman that any such separation is and must be decidably unsecurable.

I criticize the generality of a meta-theory of authorization/permission models that places strong restrictions on what the models can call objects. A file is not an indivisible unit, and people are sometimes interested in placing different permission restrictions on different parts of a file.

I concur, but we have now devolved to a discussion of the meaning of the term "object". I would argue that in your scenario, either the sub-structural pieces have different identity, in which case they are first-class objects and we are back to singleton designation, or you are engaged in an exercise in access filtering, in which case this is an example of dynamically instituted policy. Which is fine, but it introduces layering (also good) and therefore potential confusion about which layer we are speaking about and which interpreter is (must be) responsible for identity, permission, and enforcement at a given layer.

But your file example is qualitatively different from the webserver session vs. web files distinction. The difference is that the client of the webserver cannot, in principle, have a capability to any resource resident on the web server. It can only have a capability to a communication channel which, by mutual agreement, is demultiplexed at several levels, ultimately by the webserver, and whose message payload the web server uses to decide what operations to perform on the client's behalf.

Unfortunately, getting down into the grit of this set of distinctions is fairly crucial to understanding what is really going on security-wise. At a minimum, you have multiple levels of abstraction engaged in this case that are performing hierarchically structured security and permission checks and mux/demux on either end. At a higher level of abstraction where all of these agents are considered part of the reliance infrastructure, we can talk about a cross-machine capability of this sort. At a lower level of abstraction close to the machine, we cannot.

And this is where the core algebra issue and the extensibility issue come into play. The notion of an "extensible capability system" allows capabilities at various levels of abstraction to be wielded by a caller interchangeably. On the other hand, the fact that the semantic interpretation of the operations authorized by those capabilities cannot be interpreted at the "level 0 interpreter" means that enforcement of permissions on higher-level capabilities by the level-0 interpreter is problematic. And this is why I stated elsewhere that capability systems are conceived as an extensible platform for security policy enforcement, not a complete solution in and of themselves.

Where is this particular core capability algebra that you speak of defined?

The take/grant model of Snyder et al. (in its later, mature form that dropped call) is the foundational model. In practice, the model that Sam Weber and I built is probably more directly relevant. Note that this is not (and should not be) a general computation model.

To my knowledge, there is no second version of the core capability model that has ever been shown to enforce a useful security property. I've certainly seen minor variations on the model, but nothing that (to my mind) deserves to be characterized as "different but equivalent".

However, this should be viewed as the model at "level 0" of the object interpretation hierarchy. At higher levels, interpretation is done by arbitrary programs. This is inconvenient, because going to a Turing-complete model usually deprives us of the ability to reason about what (if any) properties are being enforced, and whether they are being enforced correctly.

Are you claiming that the expressive power of software running on a capability based OS is theoretically different from software running on current mainstream operating systems? Do programs running on Coyotos have observable security related behaviors that are not realizable by software running on other modern operating systems running on the same hardware?

To both questions: absolutely I am claiming this, and the distinction is that there are information and authority flow security policies in Coyotos that are both theoretically and practically enforceable (and also in W-7 and some other oCap systems), and there are no information and authority flow security properties that are theoretically or practically enforceable in any other permissions model that presently exists in the literature. I do not mean "it is unknown" whether these other systems can enforce such properties. I mean that they have been formally examined and shown not to work.

You're saying, I think, that you don't intend capabilities to do the job of talking about that other level, and I'm saying, a different setup than the one in the "Myths" paper is needed for talking about other levels.

I think I agree, and it's important to remember that at the time Myths was written, Boebert's statement was considered definitive, and capability systems were still largely considered a dead end in the field as a whole. The goal of that paper was to show that a key set of pre-conceptions in the field were wrong so that discussions like the one we are engaged in would be possible. Think of that paper as a very early step in a long re-education process, not the final word.

For example, there is a key role for protected designation and intermediation independent of who interprets operations. I'm also saying that the inseparable fusion of designation and authority is necessary in order to be able to reason successfully across the layering boundary. In an ambient authority system, you cannot denotationally connect the designator on the client side to the behavior on the service side, and in consequence it is impossible to do end-to-end reasoning across layers at all.

It is largely this denotational connectedness between capabilities and services that got Mark Miller and me so excited about capabilities. I stated elsewhere that capability semantics are essentially those of the lambda calculus with side effects. The denotational connectedness is what allows us to tie the invoked capability to the receiving application-level continuation.

The ability to connect behavior across layers of abstraction -- and potentially to reason thereby about the behavior of trusted programs in an end-to-end way -- appears thus far to be unique to capability systems. I would not rule out the possibility of other constructs in future that provide this, but today there aren't any. It is difficult to conceive of how cross-layer reasoning can be done without a designation regime that operates in a cross-layer fashion, and equally difficult to see how that reasoning might proceed sensibly in the presence of ambient authority. And of course, any system observing those two constraints is a capability system (indeed, it's an extensible oCap system).

Various follow ups

Sorry for the delayed response. Your post contained a lot of material and some of the issues are subtle.

My concern is that you seemed to be (indeed you stated that you were) responding to the Wikipedia definition.

No, I had something equivalent to that definition in mind when the thread started, and when it became clear that conflicting terminology was being used within this thread I searched for statements of other people's concepts and definitions.

The Wikipedia definition is correct in identifying the traditional defining properties of capability systems: (1) fusing designation with permission/authority, and (2) ensuring that the resulting designators are unforgeable. Unfortunately, there are several ways to achieve the second property, and it matters crucially which one is used. The three I know about are:

1. Partitioning - which may be accomplished through static typing (e.g. in PL systems) or through OS-protected data structures.
2. Encryption/Sparsity - which has several variants, but the key point is that capabilities and data are not partitioned, so I include sparsity-based approaches in this category.
3. Password capabilities - which combine some properties of each.

The reason this distinction matters lies in the capability/data distinction. In a non-partitioned approach, transmission of data generally cannot be distinguished from transmission of capabilities, because they are both "just bits".

I don't think that's right. Actually, I'm not completely sure what the claim above means, but I'm going to hazard a guess. Let's say A transmits a given permission unit to B, but does not want to give B the authority to transmit that permission unit to Process C. The worry might be that if B independently has the ability to send bits to C, perhaps B could somehow transmit the permission unit to C anyway. This worry doesn't trivially apply to cryptographic systems because things like digital signatures are unforgeable. But it would be a real worry if a particular system setup allowed B to basically transmit its identity to C - e.g. if the identity of B was established by using a particular private key. However, if the designations of A, B, and C are kept rigid in some way by the given system, then digital signatures can reference these designations in the permission unit and there is no problem - e.g. if A, B, and C are OS processes that are named by some combination of process id and system time interval, then the OS keeps track of which is which. If they were unique binary blobs, then a cryptographic hash could be used to rigidly name them. Other methods would apply to other situations (e.g. designation by IP address, MAC, e-mail address, etc).

If the discussion above doesn't capture what you were thinking about, then please give me an example.

Now the reason I am going into this level of detail is that the consequences of these differing design possibilities have such far-reaching implications for what a given system can or cannot do that trying to speak about capability system security "in general" is largely pointless. The formal semantics of the two systems are foundationally different - non-partitioned capability systems have only the read and write operations, and not the take and grant operations.

It sounds like my guess above was pretty close, and I argued that the summary above is not true for some categories of cryptographic systems. With regard to passwords, I think it would be better to be able to analyze setups where the security regime depends on assumptions about the behavior of users so long as those assumptions are made explicit by the analysis. I mean, basically every computer security regime depends on assumptions about the behavior of the owner or system administrator, including how they use passwords or equivalent physical tokens; it would often be appropriate to make those assumptions explicit as well, especially for a distributed system.

I have never observed any security model in which separation of designation and permission did not prove to be utterly fatal to all policy enforcement. Indeed, it is an implication of the modelling and formal verification work done by Harrison, Ruzzo, and Ullman that any such separation is and must be decidably unsecurable.

I can't comment on what you've observed or your generalization process. But the use of HRU above seemingly involves a subtle but significant quantifier mixup. HRU give a formal description of a type of ACL system and show a mapping from the halting problem to logical questions about the behavior of that system. The mapping involves picking out special configurations of the ACL system that encode configurations of the Turing machine. HRU do not directly show that questions about every configuration of the ACL system can be encoded as a halting problem, and a little reflection shows that there are obviously some states of the ACL system that are pretty trivial to reason about.

If the real ACL system you care about has the property that every logically expressible configuration is actually and meaningfully reachable from every other one, via well-formed versions of the type of operations that one needs to reason about, then the quantifiers can indeed be reversed. But that's simply not the case in the real world. The naive sysadmin's model of their security regime is essentially inductive: start with a secure state and only use operations that preserve the security of the state. The sysadmin doesn't in practice believe they can do any reasoning about a weird-ass system state that was picked out for the artificial purpose of satisfying a mapping from a given halting problem. For the reasons I just stated, HRU do not argue that an inductive proof of that form would either fail to find a base case or fail in the induction step, provided there were meaningful constraints on the particular sequences of operations considered to be secure. Such a proof strategy might fail for some other reason, but not because of the HRU result.

I criticize the generality of a meta-theory of authorization/permission models that places strong restrictions on what the models can call objects. A file is not an indivisible unit, and people are sometimes interested in placing different permission restrictions on different parts of a file.

I concur, but we have now devolved to a discussion of the meaning of the term "object". I would argue that in your scenario, either the sub-structural pieces have different identity, in which case they are first-class objects and we are back to singleton designation, or you are engaged in an exercise in access filtering, in which case this is an example of dynamically instituted policy. Which is fine, but it introduces layering (also good) and therefore potential confusion about which layer we are speaking about and which interpreter is (must be) responsible for identity, permission, and enforcement at a given layer.

The example was meant to show the potential for confusion and also to be essentially a real-world example. As I said originally, the problem was not in identifying ambient authority at the micro level, but rather in using that distinction at the composite level. In fact, I think it doesn't compose well even when we stay at the same level, and I hope you'll forgive me if I illustrate that with artificial examples.

Consider an analogy to computing the resistance of an electrical circuit, where the number of ohms is supposed to correspond roughly to the amount of security. We have one computation for elements in series, where resistances add, and another for elements in parallel, where the overall resistance is lessened because there are multiple paths. In the artificial example, let's say that gaining a given capability depends on two or more permission units, one of which involves ambient authority. Case 1 is the series case, where the actual protocol requires using both the ambient authority and the other, non-ambient permission unit. Case 2 is the parallel case, where the actual protocol accepts either the ambient authority or the non-ambient permission unit. My criticism is that the definition of "using ambient authority" doesn't do the right job of helping to assess the strength of the security: in Case 1 the ambient authority is used to make the protocol stricter, and in Case 2 it is used to make it weaker.
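
To make the two cases concrete, here is an artificial OCaml sketch (the checks are invented stand-ins, not any real protocol): ambient_ok models a check against an environmental credential such as a MAC/IP address or cookie, while cap_ok models a check against an explicitly presented permission unit.

  let ambient_ok ~source_ip = List.mem source_ip [ "10.0.0.5"; "10.0.0.6" ]
  let cap_ok ~token = String.equal token "opaque-permission-unit"

  (* Case 1 (series): both checks are required; the ambient check tightens access. *)
  let grant_series ~source_ip ~token = ambient_ok ~source_ip && cap_ok ~token

  (* Case 2 (parallel): either check suffices; the ambient check loosens access. *)
  let grant_parallel ~source_ip ~token = ambient_ok ~source_ip || cap_ok ~token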

Are you claiming that the expressive power of software running on a capability based OS is theoretically different from software running on current mainstream operating systems? Do programs running on Coyotos have observable security related behaviors that are not realizable by software running on other modern operating systems running on the same hardware?

To both questions: absolutely I am claiming this, and the distinction is that there are information and authority flow security policies in Coyotos that are both theoretically and practically enforceable (and also in W-7 and some other oCap systems), and there are no information and authority flow security properties that are theoretically or practically enforceable in any other permissions model that presently exists in the literature.

This same issue came up in naasking's recent post and my response. Your use of expressive power seems inconsistent to me, in the sense that you began by saying that your OS had the expressive power to enforce various authorization policies for the USB pen drive using software programs running on top of the OS to realize those policies, and I think the response to the file sub-structure case was along the same lines; whereas in the confinement case naasking was saying that the conventional OS had less expressive power because it would have to use a special software setup running on top of the OS to enforce confinement. The rhetorical point isn't too important, and I understand that the issue is a practically important distinction for Coyotos. But the apparent rhetorical inconsistency was a motivation for posing the question.

I do not mean "it is unknown" whether these other systems can enforce such properties. I mean that they have been formally examined and shown not to work.

What is the best formal description of the containment argument?

I'm also saying that the inseparable fusion of designation and authority is necessary in order to be able to reason successfully across the layering boundary. In an ambient authority system, you cannot denotationally connect the designator on the client side to the behavior on the service side,

I'm not sure how these claims map in detail to the discussion above. Are they a recapitulation or something different?

Distinguishing Transmission of Data and Authority

Since a great many security properties (indeed, every one I can think of) rely utterly on being able to distinguish between data transmission and authority transmission, encryption-based capability systems are not particularly useful in practice. The issue isn't guessability. It's the fact that capability transmission (either overt or covert) cannot be mediated in these systems, while in partition-based systems it can. [...] Since the non-partitioned schemes (note I am not putting password capability systems in this category) cannot enforce any security policies of interest, I am largely uninterested in them.

I agree that there are many reasons to not widely distribute authority to "get capability bits" or "turn capability bits into authority". If all you were objecting to is having this capability as an ambient authority (given to every new process or service), then I'd agree.

But it seems to me your assertion is equivalent to insisting that E doesn't support any security properties (that you can think of) once you reach the inter-vat communications layer. Basically, you're saying that everything written at Walnut/Secure_Distributed_Computing is so much bull.

Do you deny this? If so, then you need to clarify and qualify your assertions. If not, then how do you justify that belief?

Anyhow, I explain below that the distinction between code and data is ineffective, and that what really must be protected is the ability to 'act upon' information in a far more general sense (whether capability bits or other secret code/data bits).

In an OS, capabilities are

In an OS, capabilities are kept separate from ordinary data. Each process has its own data address space and its own capability address space. Each system call requiring a capability accepts the index into its capability space and the kernel operates on the capability representation directly. The capability space and representation is never directly exposed to user space. This is the OS equivalent of object-capabilities, and achieves equivalent safety properties to memory safety in programming languages.
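
A schematic OCaml sketch of that arrangement (the details are invented for illustration): user code only ever handles integer slot numbers, while the kernel-side table maps slots to the actual capability records and is never mapped into user space.

  type right = Read | Write
  type capability = { obj_id : int; rights : right list }

  (* Kernel-managed per-process capability space, opaque to user code. *)
  type cap_space = (int, capability) Hashtbl.t

  (* A "system call" takes a slot index, never capability bits. *)
  let sys_write (caps : cap_space) ~slot ~(buf : bytes) =
    match Hashtbl.find_opt caps slot with
    | Some c when List.mem Write c.rights ->
        Ok (Printf.sprintf "wrote %d bytes to object %d" (Bytes.length buf) c.obj_id)
    | Some _ -> Error "capability at that slot lacks Write"
    | None -> Error "no capability installed at that slot"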

Josh Stern's public key approach is weaker in that the capabilities are only probabilistically unforgeable. These capability variants are known as "capabilities as keys/data".

Capabilities kept separate from user data....

capabilities are kept separate from ordinary data. Each process has its own data address space and its own capability address space. Each system call requiring a capability accepts the index into its capability space and the kernel operates on the capability representation directly. The capability space and representation is never directly exposed to user space.

Isn't this just a roundabout way of reiterating my conclusion that the OS has to keep track of who owns which capabilities? The capability address space, which is never directly exposed to user space, must therefore be managed by the operating system. Managing the capability address space, which must contain a table of capabilities, for each and every process is exactly the same thing as keeping track of which processes have what capabilities. Isn't it?

Not really

You're confusing ownership with partitioning. The test isn't whether I "own" a capability. The test is whether I "possess" a capability. The capability itself is in no way tied to me. I can send the capability to you via a message, following which you possess the capability co-equally.

Revocability seems to require a known chain of derivation.

Capabilities are supposed to be subdivisible, delegable, and revocable.

So let's say the Sysadmin or OS gave user Charlie the authority to read and write in /usr/home/Charlie. Charlie starts up a command shell, with a reference to his own capability profile. Through the command shell, he starts up process A, a file-sharing/chat program, creating for it a capability to read and write in /usr/home/Charlie/friends/shared. It communicates this capability to another process B.

But B is a bad actor or corrupted, and communicates it to a botnet. The friends/shared directory starts filling up with spam, kiddie porn, contraband, terrorist bomb plans, malware, credit card information, etc, as the botnet uses it as a hub to distribute illegal goods.

Charlie, behind the command shell, observes this in dismay, and quickly issues a command to kill-9 process A (revoking the capability A had to write in friends/shared). The flood of spam stops. Well and good, right?

How the heck does killing the process A relate to revoking its capabilities unless the program that created the capabilities (in this case Charlie's command shell) keeps track of what process it issued them to (ie, the 'owner' of the capability)?

On a higher level, say Charlie doesn't notice it and the sysadmin notices /home/Charlie/friends/shared filling up with crap. What's the sysadmin need to do? He needs to revoke the capability that's being abused. If the OS doesn't see the structure of capabilities and owners, it doesn't know how the caps it issued have been subdivided and shared. The only capability the OS knows about that the sysadmin can revoke is the one it issued. That cap is the one that allows Charlie to read and write in /home/Charlie. Revoking it means effectively locking down Charlie's whole account rather than just the cap that Charlie ill-advisedly shared.

Why, again, would the OS not want to keep track of who owns what caps? What would it mean if the sysadmin doesn't know to whom that capability was issued? Even if it can be revoked without that knowledge, it would mean "fix the bug," that's what. Security demands accountability.

Finally, these subdivided capabilities exist in relationship to each other, right? When the command shell revokes A's cap to read and write in friends/shared, the derived cap that was shared with B, and which it passed on to other programs, also has to stop working.

When the sysadmin revokes Charlie's cap to read and write in /home/Charlie, the derived cap that the shell got from charlie to do the same has to stop working, and the derived-and-subdivided cap that A got from the shell to read and write in /home/Charlie/friends/shared has to stop working, and the shared cap that B got from A has to stop working, and the derived-and-shared cap that every last botnet process across the world is using has to stop working.

So how does 'shared equally' and 'no need to keep track of ownership' work if there needs to be a chain of authority which can be broken via revocation at any link?

Granted, one can use public-key crypto or something to obscure this derivation information. But I don't see any motivation for the creator of the capability to do so.

Revocation Patterns

In capability systems, revocation patterns are generally explicit. It is not the case that killing process A would necessarily kill process B's access to /home/Charlie/friends/shared. One might express revocation as dependency (the cap to the shared folder depends upon some other capability, perhaps one associated uniquely with process A) or via some sort of action (_atexit), though the latter is far more fragile (i.e. would be unlikely to fire for a kill -9).

A sysadmin would want the ability to kill the abused capability. It would not matter who possesses it. If Process A wishes to protect itself against Process B's abuses, it can duplicate the capability and send the duplicate to Process B, or it might follow a Horton pattern.

Of course, expressivity is certainly going to be limited for 'capabilities' hidden behind an OS layer. Things get interesting when process A can create its own capabilities then share those with process B (who may then share them with process C, without telling A), as a basis for secure IPC.

In object capability

In object capability systems, revocation is usually via some sort of Caretaker pattern. In other words, we add levels of indirection at points in the object graph where we anticipate revocations occurring, and a revocation simply breaks the delegation chain at that point. This is how B can delegate to C and D, and D can delegate to E, and if D revokes his grant to E, the capabilities C and D hold are unaffected.
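
A minimal OCaml sketch of the Caretaker idea (names invented): the grantee receives a forwarder rather than the underlying capability, and revoking the forwarder cuts off everything delegated downstream of it, while the grantor's own capability is untouched.

  (* Model a capability as something invokable. *)
  type 'a cap = 'a -> unit

  let make_caretaker (target : 'a cap) : 'a cap * (unit -> unit) =
    let link = ref (Some target) in
    let forwarder msg =
      match !link with
      | Some deliver -> deliver msg
      | None -> ()                  (* revoked: requests are silently dropped *)
    in
    let revoke () = link := None in
    (forwarder, revoke)

  (* Usage: the grantor keeps target and revoke, hands out only forwarder;
     anything delegated onward from forwarder dies when revoke () is called. *)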

It should also be noted that a capability and its delegated version are generally indistinguishable, ie. there is no "EQ" primitive. This permits the transparent interposition necessary for the Caretaker pattern to work.

Sometimes EQ is re-introduced as a closely held capability in its own right to enable certain patterns that require it, but the examples you are considering don't need it.

Finally, as David noted, responsibility tracking can be built atop capabilities via the Horton protocol, and with that you can build all sorts of quota-like policies.

In summary, you must first unlearn what you have learned. There is no identity and there is no ownership. :-)

In summary, you must first

In summary, you must first unlearn what you have learned. There is no identity and there is no ownership. :-)

Perhaps it is possible to build a system in which identity and ownership of capabilities are not tracked and still have the system work.

What would motivate doing so?

This is a serious question. If I'm designing for security, I would need a very compelling reason to *NOT* track identity and ownership, at least to the system boundary. So, even granted that it's possible, why would anyone build a system that way?

The radical notion behind

The radical notion behind capabilities is that identity and ownership are actually less than useless for reasoning about security. All you need to consider is the possession of delegable permission tokens to reason about authority and the flow of information, and everything else can be built atop this foundation.

In fact, I would go so far as to say that trying to introduce identities as a primitive notion is actively harmful, because you start to make promises you can't keep. For instance, ACL models simply can't make correct access decisions involving more than 1 principal, the safety problem in the ACL model is undecidable (but is decidable for capabilities), and ACLs allow you to express restrictions on communication that they can't actually enforce.

There are many results along these lines, and as an introduction to how capabilities manage the problems you're trying to reason about, I recommend Capability Myths Demolished.

Identity and ownership are

Identity and ownership are poorly defined. Who 'owns' an interaction between two processes? What role does 'identity' play in a particular decision? Capability discipline rightly discourages use of these ill-defined properties for anything, including system administration.

Identity or ownership is meaningful to the extent one attaches liability to it (legal, fiscal, reputational). What you really want is direct expression of liability. Liability is useful for developing a trusted service or resource market.

However, maintaining useful identities is expensive, and liability issues quickly become muddied even for simple service interactions. Further, 'liability' is too easily confused with 'security' even by people who should know better. These are reasons to limit widespread tracking of identity and ownership.

I favor supporting liability from within the model, i.e. via Horton or via use of 'purses' or 'smart contracts', or as part of a model for a resource/service market.

A real world analogy

Perhaps it is possible to build a system in which identity and ownership of capabilities are not tracked and still have the system work.

What would motivate doing so?

This is a lot simpler than sometimes it sounds.

Consider how people do security for their back yards and their cars.

Let's say my neighbour looks into my back yard and sees me standing there. He says to himself "That's Marc, and he owns that house, so that is OK." But if he sees someone he doesn't recognize there, he may say to himself "Hmmm, that's not Marc who owns that house, it could be a trespasser, maybe I should call the cops."

That's security by ownership and identity.

On the other hand, for a car, there is a key to the ignition. Having the key to the ignition allows me to drive the car. If I hand the key to a valet to park the car, that allows him to drive the car. If I don't have the key to my car, it doesn't matter that I'm me or that I own the car, I can't drive it.

That's security by capabilities.

Now, these real-world examples have complexities of their own: how do you prove identity and ownership to a stranger, how hard is it to copy a key, how do you protect keys from fraudulent copying? But these are just practical details needed to make your security system work, not conceptual underpinnings.

If you want to understand the strengths and weaknesses of these systems, and why you would choose one or the other for a particular purpose, try imagining if we reversed the systems for cars and back yards.

Hilarity ensues. ;-)

Let's say my neighbour looks

Let's say my neighbour looks into my back yard and sees me standing there. He says to himself "That's Marc, and he owns that house, so that is OK." But if he sees someone he doesn't recognize there, he may say to himself "Hmmm, that's not Marc who owns that house, it could be a trespasser, maybe I should call the cops."

The problem being, every notion of identity used in a security model I've seen hinders delegation, which is needed for the Principle of Least Authority (POLA). Ultimately, you shouldn't use identity to make access decisions, but you can use some notion of identity to assign blame (this is what Horton enables).

As for using capabilities to control access to a house and/or backyard, the key analogy works just as well: your neighbour might think it unusual that someone they don't recognize is in your backyard, but can assume that he's there legitimately given he had a key to get in.

The translation to computers

The problem being, every notion of identity used in a security model I've seen hinders delegation, which is needed for the Principle of Least Authority (POLA).

Lest the utility of the analogy get lost in the shuffle to get back to o-caps ;-), let me make explicit what I tried to communicate implicitly in the previous post.

The back yard system works because it relies on the informal judgment of humans. My neighbour will not call the cops, for example, if the person looks like a family member, or if it is someone who looks like a handyman mowing my lawn.

When you translate the "back yard model" to a computer, the computer either must "call the cops" when identity and ownership can't be established or you need some complicated formal system to handle proxy authority, which gets messy fast. (Which is the upshot of your point)

The car key model, on the other hand, is a mechanical solution: the car doesn't need to know anything about me or make any decisions, it just has to work with the right key, and not work otherwise.

Computers are more like cars than they are like neighbours, which is what makes the key system analogy a better fit for an effective security model on computers.

Worse than that

Let's say my neighbour looks into my back yard and sees me standing there. He says to himself "That's Marc, and he owns that house, so that is OK." But if he sees someone he doesn't recognize there, he may say to himself "Hmmm, that's not Marc who owns that house, it could be a trespasser, maybe I should call the cops."

You look in your neighbor's yard, and conclude wrongly that the absence of Steve (who is not Marc) means that everything is okay. This is analogous to deleting Steve from an access control list.

But in fact, Marc could be colluding with Steve, so the absence of Steve from the yard tells you nothing at all.

Two types of revocation

There are two types of revocation. One revokes all capabilities to an object. This is better imagined as an "object destroy" operation. It does not require any tracking of the chain of delegation.
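
As a toy OCaml sketch of this first kind of revocation (the object and its operation are invented for illustration): since every capability ultimately routes to the object, marking the object destroyed cuts off every holder at once, with no record needed of who was delegated what.

  type counter = { mutable value : int; mutable destroyed : bool }

  let increment c =
    if c.destroyed then Error "object has been destroyed"
    else begin c.value <- c.value + 1; Ok c.value end

  (* "Revoke all capabilities" is just destroying the object they designate. *)
  let destroy c = c.destroyed <- true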

The other revokes a particular delegation. This is only feasible in capability systems that provide some sort of transparent "wrapper" object, and it is implemented by inserting a wrapper at delegation time and later destroying the wrapper. The effect is to revoke all capabilities whose delegations are "downstream" of the revoked wrapper. From an authority management perspective, this "revoke everything downstream" behavior is what you want. Seeing why that is true may take some pondering.

Excuse me? Did I give you cause to think I'm stupid?

Why would that take any pondering?

Someone using your credential has the ability to mess up your resources! Of course you want the ability to revoke their access to your credential without revoking the credential itself! If I couldn't revoke downstream access, I'd never delegate most caps in the first place!

That's the problem with email addresses, dangit. I once gave people the capability to email me by handing them my email address. Some one or two or three or six-thousand of them have 'delegated' that capability to spammers. So now I dearly wish I could revoke the cap from those several - and everybody in the downstream from them because the spammers who've got it are *still* passing it on to more spammers - without also revoking it from my current friends and people they pass it on to.

Alas, structured delegation with downstream revocation is not (yet!) part of email addresses, so we soldier on under a mountain of spam. But if anyone has to 'ponder' in order to understand the need for downstream revocation, I strongly suspect they're either stupid or drunk, or that they don't have an email account.

Expanding...

Excuse me? Did I give you cause to think I'm stupid?

Not yet. :-)

What most people find puzzling about "revoke everything downstream" is the situation where Alice delegates to Bob, Bob then delegates to Carol. Now Alice wants to revoke Bob without revoking Carol. For example, Bob is leaving a work group that previously held all three parties. Under the "revoke everything downstream" rule, this type of selective downstream revocation is impossible without resorting to other measures and prior planning.

Most people, when they trip over this example, immediately start clamoring that "revoke everything downstream" must be the wrong thing. From experience, it usually takes quite a while to convince them that (architecturally speaking) it is nonetheless the right place to start.

Aside: the entire notion of "your" credential in capability systems is mistaken, which is the start of a long series of common misunderstandings. First, there are no credentials in the sense that the term is usually used in the security literature - a credential stands in lieu of a principal. I think you meant "authority". Second, there is no notion of authority ownership in a capability system. The notion of ownership can be re-built at higher levels, but it isn't part of the core protection system.

What shap said. In PLT

What shap said. In PLT terms, consider a simple concurrent language with a term spawn: 'a -> ('a -> ()) -> (). Now you execute:

let x = openFile "/etc/passwd" in
  spawn x (fun file -> (* do something *));
  spawn x (fun file -> (* do something else *))

'x' is a capability to the file, and both spawned processes possess that capability, ie. have co-equal access to it, but neither of them "owns" it in any meaningful way.

The capability address space, which is never directly exposed to user space, must therefore be managed by the operating system.

Indeed, there are persistent data structures that store capabilities in "capages" and ordinary data in "pages". A process running in these pages and capages does not own them though, it merely possesses capabilities to them. For instance, it possesses capabilities to read/write some data pages (for data), and capabilities to read/execute some other data pages (for code), and capabilities to load/store capabilities in capages.

Capabilities are not always Unforgeable

Within a capability programming language, it is generally impossible to express forgery of capabilities. One can achieve something similar for inter-process communication, given a mutually trusted kernel. However, when we scale up to the distributed case, we settle for 'cryptographically unguessable'. That is, we make large secure-random strings (of perhaps 160 bits each), and we call those capabilities.

In the distributed case, it is important that you not just send a capability on the wire, lest some man-in-the-middle start stealing capabilities from you. Thus, you must either authenticate the system against man-in-the-middle or you must use another mechanism to prove you have the capability without actually sharing it. Cryptographic techniques can work for these latter mechanisms; i.e. you could represent the capability as an HMAC key separately from the routing ID. This would allow you to route across an overlay network, and authenticate your messages, without actually sharing the capability.
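
A rough OCaml sketch of that split (all names invented; the standard library's Digest/MD5 stands in for a real HMAC such as HMAC-SHA256, purely to show the shape): the secret never goes on the wire, only a tag computed from it.

  type dist_cap = {
    routing_id : string;    (* public: used by the overlay to route the message *)
    secret_id : string;     (* private: proves possession, never transmitted *)
  }

  let tag cap payload =
    Digest.to_hex (Digest.string (cap.secret_id ^ "|" ^ payload))

  (* What actually travels: routing id, payload, and authentication tag. *)
  let send cap payload = (cap.routing_id, payload, tag cap payload)

  let verify ~secret_id (_routing_id, payload, t) =
    String.equal t (Digest.to_hex (Digest.string (secret_id ^ "|" ^ payload)))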

One may add a watermark to capabilities such that we can distinguish random forgery attempts as distinct from expired capabilities. There isn't much one can do in the face of random forgery attempts, though, since the actual blame may lie multiple indirections behind the final messenger, and any action might be usable for a denial-of-service attack against said messenger. So, generally, the proper action for a request with a bad cap is just to politely deny the request.

In any case, there isn't much reason to worry about people guessing random 128 to 256 bit strings any time soon. And, if it ever becomes a problem, we are free to enlarge those strings. This is not at all the same as 'guessing' pointers. Pointers are often trivially guessable - as far from 'cryptographically unguessable' as you can get.

Nothing here is forgeable

If you used an encrypted channel, then the 64-bit or 128-bit nonce is unguessable. If you didn't, then it can be picked up off the wire and there is no need to guess it. In either case, the use of wire encryption and end-to-end authentication is essential to extend a capability system across a network. In either case, that nonce should never be revealed to user code - there are strong security reasons to want to maintain a partition between capabilities and data.

So, generally, the proper action for a request with a bad cap is just to politely deny the request.

A "bad" cap, by definition, is one that fails to name an object. If it fails to name an object, then there is no one to deny the request. In general, in a properly designed capability system, bad caps should be structurally impossible in the same way that bad object references are structurally impossible in a safe programming language. If a malformed cap is encountered, the right thing to do is to panic the machine immediately, because the mere existence of such a capability is a prima facia proof that the entire security basis of the abstract machine is compromised.

128-bit numbers are guessable 1/2^128 of the time

You shouldn't call such a quantity 'unforgeable' or 'unguessable' without a qualifier (I use 'cryptographically'). To do so confuses people, such as the author of this topic, who very often expect a statement without a qualifier to mean 'absolutely'. So, unless you mean to say absolute zero probability of being guessed, you should qualify your words. To do otherwise is misleading.

that nonce should never be revealed to user code - there are strong security reasons to want to maintain a partition between capabilities and data

I disagree. There are circumstances in which capabilities must be revealed to user code as data - for example, at the integration of capability security with browsers or bookmark files. The authorities to translate from a URI string to a capability or vice versa should be encapsulated by capabilities. However, it is not unusual for these authorities to be revealed to some user code.

The security reasons to avoid this largely involve covert channels. That is solvable: don't give those sensitive capabilities to processes or services you are attempting to confine.

I am curious why you refer to a 'nonce'. A nonce, as I understand it, refers to a number that is used once. A capability should not be a nonce unless it is valid for at most one use. Nonce capabilities are a useful language feature, though, good for a number of patterns and for a direct manifestation of linear typing. (It turns out that nonce caps require some special support to implement in a highly concurrent language.)
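
For what it's worth, a single-use capability can be sketched in OCaml as a wrapper that forwards at most one invocation (sequential only; as noted above, a concurrent implementation needs something like an atomic test-and-set).

  let make_nonce_cap (target : 'a -> unit) : 'a -> unit =
    let live = ref true in
    fun msg ->
      if !live then begin
        live := false;   (* burn the capability before use *)
        target msg
      end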

A "bad" cap, by definition, is one that fails to name an object.

Yes and no. A "bad" cap is one that fails to fully name an existing object. For a partial name, or a name for an object that has since been destroyed, the message will often still be routed somewhere - i.e. to a vat in E. It makes sense for the vat to 'deny the request'.

Simply ignoring it is, of course, a denial. Whether that is considered a 'polite' denial or not depends on whether it hinders appropriate cleanup and auditing on the sender's side. (To what extent can Charlie hurt Alice by asking Alice to request invalid capabilities from Bob? You must ask this question of any distributed capability protocol.)

I offer an example above of a scheme where the full ID consists of both a routing ID and a secret ID. This might be applied for routing on an overlay network for capabilities that need multi-node redundancy (à la Chord, Pastry, or Tapestry). The routing ID can be freely plucked off the network, but the secret ID can still authenticate the message (e.g. using HMAC, or even per-message encryption).
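As a rough sketch of that split (Python-flavored; the function names, field names, and sizes are my own illustration, not any particular system's API): the routing ID travels in the clear and merely selects the target object, while the secret ID never crosses the wire and instead keys an HMAC over each message.

    import hmac, hashlib, os

    def make_cap():
        # The routing ID may be observed on the wire; the secret ID must not be.
        return {'routing_id': os.urandom(8).hex(), 'secret_id': os.urandom(16)}

    def send(cap, payload):
        # Only the routing ID, the payload, and the MAC are exposed to the network.
        tag = hmac.new(cap['secret_id'], payload, hashlib.sha256).digest()
        return {'route': cap['routing_id'], 'payload': payload, 'tag': tag}

    def accept(cap_table, msg):
        cap = cap_table.get(msg['route'])              # routing: locate the object
        if cap is None:
            return None                                # unknown cap: quietly deny
        expected = hmac.new(cap['secret_id'], msg['payload'], hashlib.sha256).digest()
        if not hmac.compare_digest(expected, msg['tag']):
            return None                                # wrong secret: deny
        return cap                                     # authenticated invocation

Here cap_table would be the hosting node's map from routing IDs to its live objects; anything plucked off the wire reveals only the routing half, so routing and authentication remain separate concerns.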

in a properly designed capability system, bad caps should be structurally impossible in the same way that bad object references are structurally impossible in a safe programming language. If a malformed cap is encountered, the right thing to do is to panic the machine immediately, because the mere existence of such a capability is prima facie proof that the entire security basis of the abstract machine is compromised

I disagree. Distributed capability systems are secure even if attackers are making guesses at capabilities. Panicking a machine (any machine) after a malformed cap is recognized would be an ideal vector for mounting denial-of-service attacks. Further, there are legitimate reasons for 'bad caps' to exist, such as heuristic garbage collection or certain forms of explicit destruction of objects. I can't begin to imagine what made you think 'panic' a reasonable response!

You shouldn't call such a

You shouldn't call such a quantity 'unforgeable' or 'unguessable' without a qualifier (I use 'cryptographically'). To do so confuses people, such as the author of this topic, who very often expect a statement without a qualifier to mean 'absolutely'. So, unless you mean to say absolute zero probability of being guessed, you should qualify your words. To do otherwise is misleading.

I agree. A better statement would be that in an OS capability system, guessing the bits of the capability doesn't help you, because there is a type-based partition between capabilities and data.

I disagree. There are circumstances in which capabilities must be revealed to user code as data - for example, at the integration of capability security with browsers or bookmark files.

This integration does not require disclosing capability representation. The use-case that does is sorting and/or uniqueness checking. I was trying to avoid getting into too many details at once. The hard requirement is that there exist no operation converting data into capabilities. That said, it is very desirable that the "disclose capability bits" operation be very closely held, since it inherently enables violation of encapsulation.

I believe that you are implicitly assuming a cryptographic capability scheme in several of your statements. I was speaking in the context of OS-based capability systems, where a cryptographic scheme is usually inappropriate.

Re: OS-based capability systems

I assume open distributed systems to be the general case.

By "OS-based capability systems" you mean those involving IPC centered around a mutually trusted kernel and thus subject a common system administrator. In the domain of open distributed systems, these elements do not exist.

The 'knowledge' that a 'mutually trusted kernel' must be part of an OS seems to be one of those things that people must first forget before they'll even attempt to understand security-in-the-large. OS-based capability systems are an evolutionary improvement from what exists today, but are still far from the in-the-large security we really want... which must cover all parts of the software life-cycles (including distribution, integration, maintenance/upgrade, and deprecation/retirement), and must cross security domains and administration boundaries. We'll never have in-the-large security without focusing on the open distributed systems case. We should not assume a common trusted kernel, or administrator... not for applications, and not even for device drivers.

In a distributed capability system, the difference between capabilities and other data mostly involves how they tie into garbage-collection and communications protocol. The programming language sitting above this protocol need not have ambient authority to forge capabilities, but the authority itself is still globally pervasive, like an unum: every node in the entire network will possess it, and a secure implementation of an open distributed capability system must not assume or require that nodes utilize this authority responsibly.

In a distributed capability system, humans are agents. They may be granted capabilities... e.g. in e-mail, off the side of a bus, over the phone, by exchange of business cards, and so on. This requires that some services (such as browsers) be able to represent a capability as data or perform the translation from data to capability. This is the sort of user-story I'm imagining when I say that authority for such translations must be available to some user code.

it is very desirable that the "disclose capability bits" operation be very closely held, since it inherently enables violation of encapsulation

I was under the impression that the dangers associated with the capability to translate strings into power (or vice versa) relate to 'confinement', not 'encapsulation'. Encapsulation still allows for overt channels to untrusted services.

If you assume a code and data distribution model that requires a lot of confinement of effects, perhaps the problem lies with the assumed code and data distribution model rather than with the properties inherent to distributed capabilities.

I assume open distributed

I assume open distributed systems [lacking a common administrative regime] to be the general case.

Since these are, in principle, unsecurable, this doesn't seem like a good place to start.

By "OS-based capability systems" you mean those involving IPC centered around a mutually trusted kernel and thus subject a common system administrator. In the domain of open distributed systems, these elements do not exist.

A common administrator is not required. Only a common administrative regime. The difference is small but important. In a secure operating system, the operating system developer ultimately controls what decisions are available to the administrator, and can therefore be viewed as a meta-administrator. With care, it is sufficient to have a common meta-administrator and a (hardware) trusted platform in order to build a practically viable distributed trustworthy computing base.

The 'knowledge' that a 'mutually trusted kernel' must be part of an OS seems to be one of those things that people must first forget before they'll even attempt to understand security-in-the-large.

The requirement that a common trusted base exist is a precondition to security. Most security is constructed by induction. If you remove the common TCB, you remove the base case. This is one of those things that people consistently refuse to learn, which largely explains why there are no working examples of security-in-the-large. Other factors encroach as well, but if this one isn't resolved, fixing all of the others won't help.

How do you propose to recover the induction without the base case?

In a distributed capability system, the difference between capabilities and other data mostly involves how they tie into garbage-collection and communications protocol.

If you discard the common TCB assumption, then this may be true, but the resulting system is unsecurable.

In a distributed capability system, humans are agents. They may be granted capabilities...

This is incorrect. At most, humans are transports. In a secure distributed design, the bits handled by the human are enveloped in such a way that they can only be interpreted by the intended target system.

In general, there are no human agents in computing systems. Open up the case on your current machine. I challenge you to find a human inside. Only programs act as agents in computing systems. These programs interpret the requests of humans, but the actions of humans are fully mediated by the programs.

it is very desirable that the "disclose capability bits" operation be very closely held, since it inherently enables violation of encapsulation

I was under the impression that the dangers associated with the capability to translate strings into power (or vice versa) relate to 'confinement', not 'encapsulation'

The "disclose capability bits" operation converts authority into strings. So long as the resulting string cannot be turned back into authority (that is: so long as there is no rights amplifying operation in the system), you're still okay on confinement and isolation. But if "disclose capability bits" isn't closely held, you introduce the ability for the caller to alter behavior based on what is called. For example, if a hostile caller detects that the "open file" service has been front-ended by a filter, they can simply not do anything hostile. This is an example of encapsulation having been violated: the caller is able to discern something more than the alleged interface.

Confinement (and its near cousin, isolation) are fundamental building blocks in constructing information flow and authority flow security properties. If you don't have them, you basically can't control much of anything.

If you assume a code and data distribution model that requires a lot of confinement of effects, perhaps the problem lies with the assumed code and data distribution model rather than with the properties inherent to distributed capabilities.

If you have a serious alternative to offer, it would be fascinating to hear about. I'm not aware of any viable alternative being proposed in the literature or in practice anywhere in the last 64 years. 1946, by the way, was the year that ENIAC was deployed.

Distributed Systems Security without a TCB

I agree with the comment on induction. And I'll even agree that a TCB might be a convenient or necessary base-case for some security and software life-cycle story or another.

But you need to be careful: constructive proofs only tell you what is true, not what is necessary. A proof that a TCB is a step in a path to security under a given set of assumptions is NOT the same as proving one cannot achieve security without a TCB, nor the same as proving those assumptions valid.

As I recall, the security story that introduces the TCB happens to assume a common administrative regime. The story starts by protecting administrator interests from users. This idea is 'multi-user operating system', which harkens back to yesteryear when hardware consisted of expensive behemoths with clear ownership and users were the commodity. An assumption was that all interactions relevant to system security occur through the 'operating system', rather than directly between users. The operating system, thus, represented the administrator's interests. The "OS-based capability system" minimizes the role of user identity in deciding authority, but retains the monolithic administrative regime.

The basic requirements of security overlap both liveness and safety. The liveness requirement includes full use of those authorities you possess, including support for fine-grained delegation and remote accessibility. Without a liveness requirement, denial-of-service would not be a security violation. The safety requirement involves protection against unauthorized behavior. Without the safety requirement, the liveness requirement would suggest we just give everyone full authority at all times. A 'common administrative regime' is not inherent to these basic requirements.

In the distributed systems security story, interactions involve code distribution. Code-distribution, in the general case, is a necessary assumption in order to achieve liveness in the face of disruption. Without loss of generality, distribution of code subsumes all forms of 'data' (should you feel the code/data distinction is profitable). Users have diverse interests and authorities. They express their interests in code, often with support of a user-interface.

If Alice wants to interact with Charlie, this will involve distribution of code. Assuming Alice were sending code to Charlie, this raises the two basic security questions:

  1. How does Charlie control the extent to which running Alice's code might compromise his own interests?
  2. How does Alice control the extent to which sending code to Charlie might compromise her own interests?

Risk for Charlie from Alice is associated primarily with ambient authority. Charlie can feel safe about running Alice's code when Charlie can guarantee that running Alice's code on his local machine won't cause him any more harm than Alice could cause by running the same code remotely. This problem is half-solved by using an object capability language (with a large helping of unum pattern or its cousins); unfortunately, object capability languages often retain ambient authority for various denial-of-service attacks. Resource conservation (i.e. markets for CPU and Memory) and effective concurrency control (i.e. anti-deadlock) really must be part of the system in order to finish the job.

Risk for Alice from Charlie includes him stealing sensitive information for his own use, or running something other than the code she offered. Fundamentally, this means Alice must express which code (which information resources and authorities) she considers sensitive. Doing so in a fine-grained manner increases how much code can be distributed, and thus improves liveness and expressiveness properties. Alice can take advantage of other properties: to the extent she is vulnerable to bogus inputs from Charlie, she is free to send number-crunching code. And to the extent Charlie already possesses authority to sensitive data resources, she can send Charlie code that leverages it. The object capability model doesn't help Alice here. What Alice needs is information-flow analysis, contagion models, and the ability to easily declare her concerns in her code.

The above descriptions assumed that Alice was sending code to Charlie. But other possibilities exist. To the extent that the services Alice is interacting with aren't sensitive or tied to hardware (actuators or sensors), Charlie might instead send code to Alice. If constraints forbid Alice and Charlie from direct interaction, there may be a mutually trusted third party that can host the interaction.

'Trust' finally shows up, albeit not in the form of a TCB. An expression of trust asserts faith that a grant of sensitive information or excessive authority will not be abused. Trust is generally limited in scope (to specific information or authority), asymmetric, non-transitive, and targeted towards individuals or groups that exhibit certain characteristics. In my labor, I've been expressing 'trust requirements' at the point where Alice describes certain information or authorities to be sensitive. Trust requirements are a specification of 'how sensitive' in terms of characteristics (usually signature authorities or certificates) a host must prove to possess in order to be entrusted with the excessive authority. A contagion model with explicit weakening spreads these requirements to other code in a link-time checkable manner.

Usefully, if trust is explicit in the language and model, then it can be applied inductively. For example, if Alice hands Charlie some code based on his meeting a trust requirement, Charlie will be able to see and grok Alice's trust-requirement annotations. Therefore, assuming Charlie really was worthy of the trust he was granted, he'll be able to respect and honor those annotations when later interacting with Bob. This produces a sort of 'web of trust'.

Trust is related to liability via reputation: breaking trust can hurt reputation, if it is ever noticed. Other forms of liability include fiscal and legal. Accepting liability can be an important part of a service, since it also increases opportunity. If Charlie wants to make a living as a broker or middle-man, or selling CPU and memory resources in a competitive market, Charlie will need to accept various forms of liability. Trust and liability are useful insecurity properties: 'trust' allows Alice to increase expressiveness/performance/disruption-tolerance at the cost of increased vulnerability, and 'liability' allows Alice to reduce security risk by managing incentives surrounding the 'threat'.

Anyhow, all this code-distribution suggests a common language and communications protocol. But it does not require a common implementation, does not require an 'administrator' concept separate from common users. There is nothing you can point at in the system and say "this is a TCB!" in any meaningful sense of trust (faith that a grant of sensitive information or excessive authority will not be abused).

At most, humans are transports.

Sure... if you assume transports are able to decide such things as when, where, and whether to invoke (exercise or delegate) an authority.

But I think the word 'transport' would be more meaningful if we reserve it for communications mechanisms that limit decisions to context-independent routing.

the actions of humans are fully mediated by the programs.

Have you been working out of Stepford, by any chance?

This is an example of encapsulation having been violated: the caller is able to discern something more than the alleged interface.

The problem you described (involving the hostile caller) can be expressed using shallow equality on capabilities. Shallow equality on capabilities is useful if one plans to ever produce a set or database of capabilities.

In any case, I grant that a "disclose capability bits" capability offers a rather shallow form of reflection and might break encapsulation, and I've never once suggested this capability be distributed throughout code except where necessary. But, to the extent it offers a significant security risk, you necessarily have bigger problems in your communications protocol.

Confinement (and its near cousin, isolation) are fundamental building blocks in constructing information flow and authority flow security properties. [...] I'm not aware of any viable alternative being proposed in the literature or in practice anywhere in the last 64 years.

I would say that web-applications have acquitted themselves quite well as a viable alternative. To a lesser degree, so has Croquet. And live programming techniques have also worked very well. But perhaps you misunderstood what I was saying with "a code and data distribution model that requires a lot of confinement of effects". I hope the above discussion helps clarify those words.

But you need to be careful:

But you need to be careful: constructive proofs only tell you what is true, not what is necessary.

I agree. It is possible that there might be another base-case model from which to induct, though it is hard to imagine how such a system could fail to require some form of axiomatic and predictable behavior. It is the axiomatic and predictable nature of the behavior that constitutes the "trusted" part of the "trusted computing base". So in the end, I think the base case will be called the trusted computing base by definition.

It is also possible that some non-inductive construction might be workable. But the nature of the problem of information flow is inherently a transitive reflection problem. So far as I am aware, all of the models for dealing with such problems formally are inductive. Perhaps someone will come up with something else that is wonderful. I would be interested, but I prefer not to hold my breath.

...the security story that introduces the TCB happens to assume a common administrative regime. The story starts by protecting administrator interests from users

I disagree. First, you seem to be confusing administrators with administrative regimes. The two are distinct, as my earlier comment about meta-administration suggests. Second, the story was at least as concerned with protecting users from administrators.

I agree with your comment that there are direct interactions between users. It is obviously not possible for a computing system to enforce rules on interactions that are conducted outside of that system.

I disagree with the characterization that the common administrative regime is monolithic. Ultimately, the notion of security is either defined or it isn't. What you are characterizing as a "monolithic administrative regime" is the existence of an enforceable, shared definition of axioms. While the specific policies of my organization and yours may disagree, it is not meaningful for us to speak about an agreed enforcement regime unless we first agree on the axioms from which our policies are formed. Only then can we attempt to reconcile the policies. This is true whether the system is monolithic or distributed.

Your point that there are issues which go beyond safety and information flow is well taken. Indeed I tend to focus first on information and authority flow, because if those cannot be sufficiently constrained the other issues pretty well seem to go out the window. Let's walk before we run.

Unfortunately, object capability languages often retain ambient authority for various denial-of-service attacks. Resource conservation (i.e. markets for CPU and Memory) and effective concurrency control (i.e. anti-deadlock) really must be part of the system in order to finish the job

How are these ambient? In all of the oCap systems I have seen that try to deal with this, the sources of resource are explicitly named and therefore are not ambient. The fact that the clients of those sources are subjected to a shared policy does not constitute ambient authority.

Anyhow, all this code-distribution suggests a common language and communications protocol. But it does not require a common implementation, does not require an 'administrator' concept separate from common users...

There are applications for which this may be true. Further, there are applications for which the level of required confidence is socially ensured. But there are counter-examples. For example: there is code that, in order to perform its function, requires certain guarantees concerning privacy from the host platform. No distributed application instance can sensibly hold a decryption key if the local host or its users can snoop the key. That privacy guarantee cannot exist without a remotely auditable TCB.

I agree that we might imagine a distributed system in which more than one "sufficiently enforcing" TCB might exist, and from which we might construct a collaborative distributed system. I have stated elsewhere in this thread that a common TCB is required, and your points illustrate that this is mistaken. What is required is some collection of auditably present TCBs that ensure the reliances required by the particular application.

This distinction is why (in spite of the fact that I routinely slip up) Mark Miller and I both prefer to speak in terms of reliances rather than trust. The term "trust" spills over into social intuitions that aren't always relevant, and this often leads to hidden or mistaken assumptions.

At most, humans are transports...

Sure... if you assume transports are able to decide such things as when, where, and whether to invoke an authority.

I have seen humans enter keystrokes, touches, and even toggle switches. I have seen programs, in response to these actions, invoke authorities. I have never once, in the entirety of my career, observed a human to invoke a computational authority. And neither have you.

Have you been working out of Stepford, by any chance?

Nah. Ossining. :-)

I would say that web-applications have acquitted themselves quite well as a viable alternative. To a lesser degree, so has Croquet.

I suspect in both cases the success has devolved from decreasing the reliance on information flow restrictions. Which is certainly a valid alternative to solving the information flow problems. I would be interested in pointers to the kinds of things you mean in context of Croquet.

David: I'm enjoying the exchange, and I hope that you are as well.

I have seen humans enter

I have seen humans enter keystrokes, touches, and even toggle switches. I have seen programs, in response to these actions, invoke authorities. I have never once, in the entirety of my career, observed a human to invoke a computational authority. And neither have you.

So, humans have the authority to wield capabilities, but not permission. ;-)

Ouch.

Ouch.

already walkin'

I tend to focus first on information and authority flow, because if those cannot be sufficiently constrained the other issues pretty well seem to go out the window. Let's walk before we run.

The 'walk before we run' philosophy leads us to design crutches that will help us walk, but will hinder us when we later attempt to run. This would be okay if it was easy to later discard the crutch. Unfortunately, in the design or development of platforms, it is often far more difficult to remove features/crutches than it is to add them.

Besides, the entire existing system is 'walking'. We already know how to walk, albeit with a multitude of crutches (anti-virus, firewalls, paranoia), limping performance, and the occasional stumble. But since we have never walked without these crutches, most people don't even imagine it to be possible. Motivation to discard the crutches won't ever come from 'learning to walk'.

We must learn to run before we'll ever know what it means to truly walk.

In all of the oCap systems I have seen that try to deal with this, the sources of resource are explicitly named and therefore are not ambient.

My words were "object capability languages often retain ambient authority for various denial-of-service attacks". If you need examples of such languages, consider E or Joe-E. Pointing out that some ocap systems have attempted to deal with this doesn't mean they fully succeeded, especially not in a manner that retains other relevant safety properties.

Protection against DOS is a pervasive issue, much like error handling or persistence or concurrency control or real-time programming. Pervasive issues are ideal targets for standardization and possibly even some language support.

we first agree on the axioms from which our policies are formed. Only then can we attempt to reconcile the policies. This is true whether the system is monolithic or distributed.

I agree.

the base case will be called the trusted computing base by definition

The definition I have for TCB refers to hardware, firmware, and software that must be trusted for the security of the system. Can you point me to any particular unit of hardware, firmware, or software whose compromise can affect the security of an entire distributed system?

require some form of axiomatic and predictable behavior. It is the axiomatic and predictable nature of the behavior that constitutes the "trusted" part of the "trusted computing base".

I agree. That leaves you to find the "computing base" part of "trusted computing base". ;-)

I rely upon axiomatic and predictable behavior associated with the communications - the language and protocol. For example, I rely upon the fact that it is cryptographically difficult to guess large random numbers. I also rely upon the idea that users and regimes will attend to their own interests to the extent those interests are made clear to them.

you seem to be confusing administrators with administrative regimes. The two are distinct, as my earlier comment about meta-administration suggests.

I grant the comment about 'meta-administration' is valid to the extent that expression of enforceable administrative policy is constrained.

The 'seeming' of confusion is because I do not consider the distinction to be especially relevant. Whether you speak of administrators or administrative regimes or restrictive meta-administration, you are speaking of a common and privileged set of users whose policies or interests are protected by a TCB.

Second, the story was at least as concerned with protecting users from administrators.

How so?

it is obviously not possible for a computing system to enforce rules on interactions that are conducted outside of that system

The job of a secure computing system is to support diverse users utilizing their authorities, expressing their interests, and enforcing their own rules. What I mean by "directly between users" is simply "without interference from someone's common administrative regime". Under this philosophy, administrators are users. And there is no 'common user regime'.

What you are characterizing as a "monolithic administrative regime" is the existence of an enforceable, shared definition of axioms.

What I characterize as a "monolithic administrative regime" is the existence of an enforceable, 'common' security policy to which less-privileged 'users' must adhere. Centralization is the basis for 'monolithic' structure, whether it be of communications or policy.

A system of axioms does not necessarily constrain expression of an enforceable policy. Axioms will affect how - and how efficiently - a given security policy might be expressed and enforced, so you might reasonably point to a meta-administrative aspect: the language designer gets to choose which sorts of security policies are most efficiently expressed or enforced.

But even if you shun a Turing complete policy-description language, and thus constrain absolute expressiveness of the security policy, the limits to expression may be out there on the edge of irrelevant. There doesn't seem a reasonable basis to assert a 'common administrative regime' exists if there isn't an enforced administrator policy you can point to.

there is code that, in order to perform its function, requires certain guarantees concerning privacy from the host platform. No distributed application instance can sensibly hold a decryption key if the local host or its users can snoop the key. That privacy guarantee cannot exist without a remotely auditable TCB.

I am curious. Can you name an example of such code?

I embrace the fact that ocap languages prevent you, syntactically, from even attempting to express impossible security requests. This seems like one of them. Even if someone did give you a reference to audit a 'remote TCB', I seriously doubt you can ever ensure that the TCB you are auditing is the same one upon which your code is running.

It seems to me that 'local privacy' is a trust issue. I've already described one way to tackle trust issues.

Mark Miller and I both prefer to speak in terms of reliances rather than trust. The term "trust" spills over into social intuitions that aren't always relevant, and this often leads to hidden or mistaken assumptions.

I use 'trust' with its full social connotations, including potential for betrayal.

I have seen humans enter keystrokes, touches, and even toggle switches. I have seen programs, in response to these actions, invoke authorities. I have never once, in the entirety of my career, observed a human to invoke a computational authority.

What is so fundamentally different between a human toggling switches vs. a CPU tweaking transistors, that the latter invokes a computational authority but the former does not?

Does it have something to do with size? :-)

in both cases the success has devolved from decreasing the reliance on information flow restrictions

Indeed. And that's a fine example to follow for developing applications and program extensions in the future.

For in-the-large software security, it is critical that the entire software life-cycle - development, integration, distribution, upgrade, extension, etc. - be brought into the model. I am convinced that doing so requires reducing reliance on 'confinement' (or 'isolation')-based information flow restriction.

Web-apps and such have been teaching us how to walk this path. Without better programming models, however, we'll never be able to 'run'... supporting secure mashups, or 'web-apps' with full video-game performance, or zoomable user interfaces to the web.

By the same reasoning that led you earlier to discuss 'meta-administration', you could assert that every 'application' represents the administration and meta-administration supported by its developers. Thus, what we really have - even today - are hundreds of distinct administrative domains overlapping and interacting and dissolving as swiftly as business relationships and web connections. That view becomes especially useful to the extent that 'applications' can interact with external services or one another, or are subject to remote monitoring or upgrade.

My basic plan to achieve security in that mess is:

  1. eliminate ambient authority, such that running code you don't trust at least won't cause you any harm. This covers the 'safety' requirement of security.
  2. enable practical fine-grained code distribution at the sub-application level, aka tierless programming, without delivering security-sensitive content to untrusted remote hosts. This covers the 'liveness' requirement of security. It enables critical services to keep running in face of disruption or network partitioning.

Each of these steps enables a multitude of optimizations and better supports systems-level garbage collection.

I'm enjoying the exchange, and I hope that you are as well.

Good to hear.

Encapsulation vs. Confinement in App Models

Under the assumption that developers can influence direction of code/data distribution, my hypothesis is that "encapsulation" is sufficient to control flow of sensitive information, and that confinement - while useful for some assertions - is rarely necessary.

To challenge my hypothesis, assume we want to do some symbolic math. It would be entirely feasible to tackle such a problem by use of a math library, likely even a 'pure' math library, which could certainly be confined. However, in-the-large someone must develop and maintain this library. Optimizations and fixes will be added. So, another option is to grab a capability to a remote symbolic math service, such as Wolfram.com, and drop that into your project. Doing so shrinks your project a bit and leaves maintenance to the guys providing the service. However, there are also many problems with a 'remote' service: latency, disruption, and even security issues because you'd be sending 'sensitive' numbers and other data over to people at Wolfram.

But, if we can influence direction of code-distribution - i.e. by marking certain code or data 'sensitive' such that it isn't distributed - we might instead have the service replicate itself locally. In this manner, the remote service has authority to maintain its own code (those updates would be mirrored locally), but it cannot steal sensitive information or authorities that are 'encapsulated' to a local host.

A math lib is something of a 'worst case' for this sort of confinement question, and there are other reasons one might choose a confined library (such as statically proving correctness, as with Coq). A majority of applications or services cannot be readily confined or 'proven correct'. Within the 'service' based approach it is easier to specify security policy for interacting with a service than it is for "instantiating" a new one. (Instantiating a new service often requires you provide a lot of low-level networking capabilities and such. Interacting with existing services is simpler.)

Don't agree

In a partitioned capability system, confinement allows utterly hostile subsystems to be absolutely prevented from leaking or disclosing authority, and also from wielding authority whose presence is unknown to the user's client application.

Sensitivity marking schemes are inherently discretionary. If you can rely on the target system to honor the markings, you didn't need them in the first place.

Confinement is a crutch.

confinement allows utterly hostile subsystems to be absolutely prevented from leaking or disclosing authority, and also from wielding authority whose presence is unknown to the user's client application

By shunning confinement, you may expect and require those "utterly hostile" subsystems to provide their own authorities to almost everything of interest. They hold their own remote and unum authorities, tied to their own responsibility. This frees you significantly: you can limit your interaction to just a few high-level capabilities, which greatly simplifies reasoning about security policies. In practice, much authority is effectively passed in the other direction: the potentially hostile subsystem is offering limited authority to you, and you get to play the bad guy. Thus, patterns are based around a certain degree of mutual distrust.

By favoring confinement, you become responsible for all authorities granted to potentially hostile code while "instantiating" (aka "installing") it, in addition to the same high-level capabilities for interacting with it. Wielding those extra authorities for instantiation is highly subject to error, especially as the application code grows more expressive or 'interesting' and it becomes difficult to choose a 'least authority'. Further, since those authorities being used to instantiate the hostile code are your own, they are your responsibility and are thus subject to abuse. Yet it is far too difficult to 'audit' low-level capabilities against abuses.

Thus, in-the-large confinement subjects application users to a great deal more liability and potential for security error. It is a crutch. It hurts expressivity, requires you to provide extra authority to hostile code, and fails to account for the full maintenance life-cycle of the software you are instantiating/installing.

In-the-small, confinement is perfectly reasonable. One can use confinement to build blocks of code, especially when there is no need to express 'effects'. However, in-the-small there is no reason to expect any hostile code. You can often eyeball such code, or simply examine its type signature, to prove against hostile application of authority.

Sensitivity marking schemes are inherently discretionary. If you can rely on the target system to honor the markings, you didn't need them in the first place.

You make erroneous assumptions:

  1. you assume that there is a single 'target system' for a given block of code
  2. you assume that, without those markings, a 'target system' would somehow magically know which code/data is sensitive and the extent to which it may be further distributed

It is true that sensitivity markings are discretionary. These markings simply constrain distribution of code ('code' including data and authority). Code won't ever be sent to a host that isn't expected to honor it. Indeed, that's exactly what each sensitivity marking says: which characteristics (authorities or certificates) a system must prove to already possess in order to be entrusted with the sensitive code and expected to honor the markings. E.g. one might assert: "only systems with a Secret DOD clearance cert are allowed to see this authority, or the inner workings of that code".

Those systems are entirely free to betray your markings. If they do, you can't do anything about it - except to stop trusting whomever certified them in the future, and to tell other people to do the same. It is in your best interests if you limit your trust to the extent that liability exists.

Markings on the code make your expectations clear. It is impossible to honor your markings if you don't provide them! Well... except by the most inefficient and inexpressive code-distribution mechanism possible: hoarding all the code then depending upon short-sighted crutches like 'confinement'. By providing those markings to a 'target system' that is likely to honor them, that system gains the information necessary to make correct decisions about code-distribution, i.e. to decide whether it would be okay to send code/data to wolfram.com to compute remotely (perhaps taking advantage of a cloud service) vs. request wolfram.com to send enough code/data to run the service locally.

Don't agree

Your examples of shunned confinement all seem to involve remote services. From the client perspective, such services are inherently unconfined, so I don't see what there is to shun. In any case, compromises arising from such services don't leave your machine broken.

Your assumption that capabilities granted to a confined subsystem are fine-grained doesn't match experience. The so-called "power box" patterns work exceptionally well, and do not involve excessive user decision making.

Finally, I think you're missing the evidence of the cloud, which is that computation and data want to be co-located for reasons of bandwidth. This changes the party who is concerned about confinement, but it doesn't change the requirement.

In any case, this is a point on which we are deeply not going to agree. Your "solution" is predicated on a priori trust, which hasn't worked very well over the last 60 years. My approach is based on the notion of "suspicious collaborators", which seems to work better in real practice.

Suspicious Collaborators also Shun Confinement

Your examples of shunned confinement all seem to involve remote services. From the client perspective, such services are inherently unconfined

My example of 'shunned confinement' used remote services in cases where a confined implementation also is possible, as with a 'confined math library' vs. 'remote wolfram.com math service'.

In any case, compromises arising from such services don't leave your machine broken.

That is untrue. Compromises arising from interactions with remote services have much opportunity to leave machines broken... quite literally, in the case of unmanned systems or command and control systems. POLA is about controlling the extent of said compromise.

Your assumption that capabilities granted to a confined subsystem are fine-grained doesn't match experience.

I do not assume this.

When instantiating/installing a confined subsystem, you'll tend to grant low-level (and, therefore, inherently coarse-grained) authorities. For example, you'll often offer access to a whole network or nothing at all. (Fine-grained authorities, by comparison, would often be down to templated messages to a specific remote service.)

When I stated it can be difficult to assign a 'least authority', I meant even at the coarse-grained level. For example, it wouldn't seem like a Solitaire game needs access to the network, but maybe someone adds a 'leaderboards' feature to compare your scores against others, or perhaps enhances the application to support multi-player Hearts, or perhaps the Solitaire game is licensed on a pay-per-use policy. The people designing the applications are the ones who choose which features its consumers will need. (That Trojan horse is awfully neat; let's bring it inside!)

The so-called "power box" patterns work exceptionally well, and do not involve excessive user decision making.

The 'power box' pattern can work, but 'exceptionally well' has never been tested in the field (at a scale large enough to attract attackers). My concern lies with scaling to a large number and variety of applications, especially including the extensible ones, and the issue of teaching habits to users. You may review my thoughts on the subject.

I think you're missing the evidence of the cloud, which is that computation and data want to be co-located for reasons of bandwidth.

I explicitly (several times, in fact) do not make a distinction between 'code' and 'data'. Thus, you speak about "colocation of computation and code" to me. I do assume that computation occurs at the code's location.

Performance, disruption tolerance, redundancy, and security each have something to say about how code should be distributed. For this thread I am focused on the security constraints of code distribution. If security's safety constraints can be satisfied while leaving some freedom for how code is distributed, then the other three properties get their say.

Your "solution" is predicated on a priori trust.

No, it isn't. Trust is something that my solution will leverage when possible. Other than correctness proofs or 'validation' of application code, nothing you can express in a confinement-based approach will rely upon trust in the shunned-confinement solution. (I'm not against confinement for code small enough to prove correct, at least if that proof frees it from the maintenance life-cycle.)

To the extent trust exists, I can reduce security-based safety constraints upon code-distribution and thus allow concerns for performance, disruption tolerance, and redundancy some extra opportunity to influence code-distribution. (Aside: I do not require any temporal dependencies about whether trust is 'a priori': I would happily take 'ex post facto' advantage of trust relationships that later develop, by redistributing long-lived code.)

Trust can sometimes enhance absolute system expressivity. For example, a mutually trusted third party can sometimes host a sensitive interaction between suspicious collaborators that would otherwise be unable to interact. Mark Miller and Marc Stiegler discuss this property in The Digital Path. However, such cases are so far beyond the expressiveness achievable in confinement-based solutions that you are certainly in no position to complain about them.

My approach is based on the notion of "suspicious collaborators"

No, it isn't. Your approach is based on confinement: imprison one 'collaborator' while the other holds all the authority. The imprisoned component offers limited forms of advice, which the jailer may then execute on his own authority (despite being ignorant of imprisoned component's larger designs). One might reasonably question whether this qualifies as "collaboration" at all.

If you want an approach to mutually "suspicious collaborators" that protects both collaborators, you must shun confinement.

Code == Data, but Code => Behavior

I explicitly (several times, in fact) do not make a distinction between 'code' and 'data'. Thus, you speak about "colocation of computation and code" to me.

Ignoring the security implications of writeable code, I concur that code is just data. But the execution of code is behavior, and data doesn't get executed. Behavior is the source of security compromise, so one must view code and data quite differently.

But I spoke about the localization of computation to data. The reason to ship code is driven primarily by latency issues. The problem with shipping code is that one cannot (in general) know what its behavior will be.

Let's make this concrete. We ship code all the time. Consider ActiveX. How do you propose that the actions of an ActiveX control should be appropriately restricted? I submit that you will find yourself re-inventing confinement.

If you want an approach to mutually "suspicious collaborators" that protects both collaborators, you must shun confinement.

If there is a way to achieve that, I have never seen it. Certainly you have offered no solution above. Can you explain, concretely, how a suspicious collaborator can obtain robust guarantees of anything in your scheme?

Interpretation of data is behavior.

execution of code is behavior, and data doesn't get executed. Behavior is the source of security compromise, so one must view code and data quite differently.

Interpretation of data is behavior. Data is 'executed' whenever a decision is made based upon it. All programs acting on the content of their input are interpreters, with the input yet another program. Behavior is the source of security compromise, so one must view code and data the same.

[Edit] Further, sensitive authorities and sensitive information need equal protections. Asserting that a given solution cannot leak 'capability bits' is important, but it is similarly important to avoid leaking 'sensitive information bits'. The ability of an attacker to obtain sensitive information and take action upon it is a real security risk. In that sense, capabilities and data are not especially distinct.

Between these, I do not see a sufficient justification for distinguishing code, data, and capability. This opinion is reinforced by the fact that every historical attempt to separate code from data seems to have failed, generally resulting in the introduction of scripting languages (JavaScript, ActiveX, VBScript, SQL Procedures and Triggers, etc.). This opinion is further reinforced by the need, in general, to distribute code to protect my authorized interests across transport-layer network disruptions.

The assertion that code and data are meaningfully distinct is a fallacy. Everything you build upon that distinction will be flawed.

Ignoring the security implications of writeable code [...]

Do you refer to 'writeable code' in unsafe languages or those with ambient authority? Or do you refer to the security implications of mutation in general?

Even someone who isn't willing to embrace that 'data' really is 'code' should recognize that first-class functions, in combination with either garbage-collection or the ability to assign said functions to a mutable variable, already implies in-the-large mutation of code.

The reason to ship code is driven primarily by latency issues.

There are many reasons to ship code. Latency is one of them. System utilization - e.g. making users share the compute burdens, or parallelization of a big problem - is another. Redundancy is a third reason: code replicated to multiple locations is much more likely to survive a node failure. Disruption tolerance in a faulty network is a fourth, and has positive security aspects (liveness). And yet another reason is to meet security information-flow requirements; i.e. one doesn't wish to send sensitive content across the network to be processed on an untrusted machine, so there is much 'demand' for the remote code to be shipped instead to the guy holding the sensitive content (often in the form of a 'library' or 'application').

I do not know whether latency is 'the primary' reason to ship code. But I do think it important to understand the different reasons and the occasional conflicts between them.

The problem with shipping code is that one cannot (in general) know what its behavior will be.

Aye, that is sometimes a problem. Same could be said of shipping that vague class of code you call 'data', since you don't always know how it will be interpreted. So the relevant question is: under which conditions is this a security problem?

Let's make this concrete. We ship code all the time.

Indeed. That is especially true once you accept that all 'data' is also code.

Consider ActiveX. How do you propose that the actions of an ActiveX control should be appropriately restricted? I submit that you will find yourself re-inventing confinement.

I would strip ActiveX of ambient authority. I would allow the code to carry unforgeable designations for both remote and unum authorities. (Thus, authorities may leak to the recipient of the ActiveX code.) A 'remote' authority might, perhaps not coincidentally, even be hosted at the recipient, and thus have low-latency access. I would then need to change the name, because the language in question is no longer ActiveX. At this point, I'd still need to integrate concurrency, safe and denial-of-service resistant concurrency control, and resource management... all of which would likely involve a massive overhaul of ActiveX's fundamental model of computation.

I have never advocated that we utilize an arbitrarily selected language as the basis for code distribution. There are a number of practical requirements regarding the analytic properties of the code being shipped about. I said earlier: Risk for Charlie from Alice is associated primarily with ambient authority. Charlie can feel safe about running Alice's code when Charlie can guarantee that running Alice's code on his local machine won't cause him any more harm than Alice could cause by running the same code remotely. The language for code-distribution must be one that Charlie will be comfortable executing after a cheap and cursory examination, even when he doesn't trust its source. (The generalization of this feature has been called 'proof carrying code'. It is much easier to achieve in the specific case of a limited class of properties supported by a high-level distribution language.)

[Edit] Though, it occurs to me that you and I may have different definitions for 'confinement'. I do not prohibit distributed code from containing or exercising remote and unum authorities, hence I assert that I do not require 'confinement' as under Lampson's definition. Lampson's conclusions were made under the assumption that application/process code is internally inscrutable. I reject this assumption by requiring that code-distribution use higher-level code for which some useful properties can be verified. 'Mutually-Distrustful Computing Bases' can use marks of sensitivity and contagion rules to constrain which code (and data, and authorities) are distributed to which distrusted computing base. This, together with support for effect-free functions or values where necessary, can achieve security goals without depending upon in-the-large confinement of applications or other code-distribution packaging.

Can you explain, concretely, how a suspicious collaborator can obtain robust guarantees of anything in your scheme?

Regarding which specific subsets of 'anything' are you most interested? I've already offered a lot of specific reasoning, including the basic questions of mutually suspicious Alice and Charlie and the math lib vs. math service example.

Non-zero?

If someone presents a bogus pattern of bits calling it a capability, isn't there a nonzero chance that the bit pattern will be the same as a bit pattern that someone would present to represent a legitimate capability, and therefore be accepted by the OS?

Say keys are 128 bits, and there are 2^32 capabilities in the system (a ridiculously large number for a single machine). In this case, indeed, there is a non-zero chance that a random 128 bits will be a valid capability. The chance is 2^-96. If you write a program that randomly guesses a million capabilities a second, it would be expected to find a valid capability in about a quadrillion years. If that is not good enough for you, use a 256-bit key, in which case we're looking at 10^54 years.
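A quick back-of-the-envelope check of those figures (Python; the one-million-guesses-per-second rate is the one assumed above):

    SECONDS_PER_YEAR = 365.25 * 24 * 3600
    GUESSES_PER_SECOND = 1e6

    def years_to_guess(key_bits, live_caps=2**32):
        # Each guess succeeds with probability live_caps / 2^key_bits,
        # so the expected number of guesses is 2^key_bits / live_caps.
        expected_guesses = 2**key_bits / live_caps
        return expected_guesses / GUESSES_PER_SECOND / SECONDS_PER_YEAR

    print(years_to_guess(128))   # ~2.5e15 years: on the order of a quadrillion
    print(years_to_guess(256))   # ~8.5e53 years: roughly the 10^54 quoted above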

Needless to say, it's not a vulnerability worth worrying about.

By the way, this is why it's safe to assume that if two files have the same SHA-256 hash, they're actually the same file -- an extremely useful property.
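For instance (standard-library hashlib; the file names are just placeholders):

    import hashlib

    def sha256_of(path):
        with open(path, 'rb') as f:
            return hashlib.sha256(f.read()).hexdigest()

    # Treat the files as identical if their digests match.
    same = sha256_of('a.bin') == sha256_of('b.bin')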

Type-based protection

And of course, if capabilities are protected by typing or by partitioning, it doesn't matter whether you guess the right bits.

A lot of the safe language intuitions are relevant here. You can't just "make up" a pointer.

Some background reading

Can I make a few well-intentioned suggestions? There's lots of background material on capabilities, which I think might help. Here are my favorite recommendations:

Ben Laurie's introduction: http://www.links.org/files/capabilities.pdf
Mark Miller's thesis: http://www.erights.org/talks/thesis/
My survey talk: http://www.youtube.com/watch?v=EGX2I31OhBE

Also, I suggest asking these questions on the cap-talk mailing list: http://www.eros-os.org/mailman/listinfo/cap-talk There are many folks there who can help with these questions.

Now, I don't mean to foist you off without any answers. You seem to have roughly the idea, but there are a few premises/assumptions that I think might not be quite right. There are many ways to implement object capabilities; one is via the OS, and another is via the language. Since you ask about the OS, I'll focus on that angle.

In the OS approach, there are multiple ways to represent caps. One way is that a capability is an unguessable 128-bit string; the OS has a mapping of "128-bit string (i.e., cap)" to "resource designated by that cap", but doesn't know who owns any particular cap. When anyone presents a valid 128-bit string to the OS, the OS lets them access that resource.

Another OS approach is to represent caps the same way that Unix represents file descriptors: the OS maintains a per-process data structure that maps small integers to resources. A process can identify the cap it wants to use by passing a small integer to the OS. The OS provides APIs to allow a process to communicate any cap it has to any other process that it has a cap to.
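A minimal sketch of that descriptor-style representation (a toy kernel-side table; all names here are invented for illustration): user code only ever handles small integers, and the kernel resolves them against the per-process c-list, so no capability bits ever exist in user space to guess or forge.

    class Process:
        def __init__(self):
            self.clist = []            # kernel-side: index -> resource object

        def grant(self, resource):
            # The kernel installs a cap and hands the process only an index.
            self.clist.append(resource)
            return len(self.clist) - 1

        def invoke(self, index, request):
            # User code names a cap by index; the kernel does the lookup.
            if not 0 <= index < len(self.clist):
                raise PermissionError("no such capability")
            return self.clist[index](request)

        def send_cap(self, index, other_process):
            # IPC: copy one of our caps into another process's c-list
            # (which itself requires holding a cap to that process).
            return other_process.grant(self.clist[index])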

You ask how caps persist across a reboot. There are multiple answers here, too. One model is that caps don't persist across reboots and processes somehow have to re-obtain all necessary caps each time they start. (Obviously, this has some severe disadvantages.) Another model is that processes are persistent: when you reboot, the same set of processes comes back to life, thanks to the magic of checkpoint and restore. (When a process comes back to life, it is in the same state as it was before, including having the same set of caps.) See EROS for an example of the latter model.

There are many other ways to do it. These are just some examples.

Finally, you ask about revocation. The standard way to support revocation is that if Alice wants to give Bob a capability C that she might later want to revoke, she doesn't give Bob the capability C itself. Instead, she constructs a proxy object that wraps C. All invocations to the proxy object are relayed to C -- except that the proxy has a single bit of state, to indicate whether it has been revoked. Initially, the proxy is in the unrevoked state. At any point, Alice can cause it to transition (irrevocably) to the revoked state. When the proxy object is in the revoked state, then it no longer forwards invocations to C. Now Alice can pass Bob a reference to the proxy object (instead of C). This is known as the Caretaker pattern. See, e.g., Mark Miller's thesis.
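Here is a minimal sketch of that Caretaker in Python, purely illustrative (see the thesis for the real treatment): Alice keeps the revoke function, and Bob gets only the proxy.

    def make_caretaker(target):
        """Return (proxy, revoke). Give away the proxy; keep revoke."""
        state = {"revoked": False}

        class Proxy:
            def __getattr__(self, name):
                if state["revoked"]:
                    raise PermissionError("this capability has been revoked")
                return getattr(target, name)        # relay the invocation to C

        def revoke():                               # a one-way transition
            state["revoked"] = True

        return Proxy(), revoke

Alice calls make_caretaker(C), hands Bob the proxy, and can later call revoke() without needing Bob's cooperation.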

Since I'm mentioning many different ways to build a capability architecture, I should also mention that they have slightly different properties and slightly different advantages/disadvantages. Once you understand how they work, you can probably reason out what the advantages/disadvantages are.

There's lots more to be said. If you have more questions, it's worth asking on cap-talk: you'll find many helpful folks there.

Definition not correct.

In the OS approach, there are multiple ways to represent caps. One way is that a capability is an unguessable 128-bit string; the OS has a mapping of "128-bit string (i.e., cap)" to "resource designated by that cap", but doesn't know who owns any particular cap. When anyone presents a valid 128-bit string to the OS, the OS lets them access that resource.

David, I'm surprised at you! What you describe here is not a capability system, because the capabilities are unprotected and therefore forgeable. There are systems of this sort that have been described as capability systems in the literature. They fail every security test one can think to throw at them, including failing the confinement test.

In an OS capability system, what the application holds is an index. The capability itself resides in data structures protected by the operating system and inaccessible to the application. In consequence, the guessability of the capability is irrelevant. This is true in the same way that guessing the bit representation of an object reference isn't enough to let you fabricate one in a safe PL.

Thank you so much for the reading list!

Ben Laurie's introduction: http://www.links.org/files/capabilities.pdf

A nice manager's brief; it validates much of my thinking on caps as at least one valid strategy among several (including having the OS manage them so that they can remain secure without being secret, and track "ownership" of each so that a process attempting to use one fails unless the OS has seen that cap get specifically delegated to that process), but it is light on implementation strategies.

Mark Miller's thesis: http://www.erights.org/talks/thesis/

Ah. Here's the unadulterated how-to of the pure ocap pattern, which seems to be the most popular of the several strategies. Good. This gives me a better understanding of why so many claims have been made -- very limited forms of most of them are supported.

My survey talk: http://www.youtube.com/watch?v=EGX2I31OhBE

Sadly, there is no 64-bit Flash player that is free-as-in-libre yet; I shall continue to forego youtube until this is remedied.

[...] and track "ownership"

[...] and track "ownership" of each so that a process attempting to use one fails unless the OS has seen that cap get specifically delegated to that process

Hmm, I didn't see mention of this in that document. What section are you referring to?

This gives me a better understanding of why so many claims have been made -- very limited forms of most of them are supported.

Do you mean limited forms of the claims being made about capabilities are actually feasible? Can you identify these claims?

Section 3.6, third

Section 3.6, third paragraph:

However, it is sometimes desirable to authenticate the wielder of the capability, rather than relying on mere possession. There are two reasons you might want to do this
• To restrict delegation of capabilities.
• To avoid having to keep capabilities secret, where secrecy is a necessary prerequisite to unforgeability.

I had not envisioned the OS restricting the delegation of capabilities. That seems like a futile quest to me. Unforgeability of capabilities without a secrecy requirement was my rationale.

In the scheme I had in mind (and still consider superior to the "pure" ocap model where identity is not tracked but secrets are required), the OS keeps track of which processes possess which caps. No secrecy is required because even if a process learns the bit pattern of a cap it does not possess, it cannot get the OS to do anything with it; the OS knows that it does not possess this cap.

A process creates a capability by requesting one from the OS, supplying one or more capabilities to derive it from and a code pointer to a closure that checks any further restrictions it wants to place on the new cap. The OS creates the desired capability, a corresponding revocation cap, and optionally a cap to read a stream that monitors uses of that cap by the transitive closure of its delegees, and designates the creating process their "owner" (an unfortunate choice of words on my part -- more conventional usage in capabilities jargon would be "sole current possessor"). In practice, the monitor and revocation caps might be made available by the OS to root or root's designee (the sysadmin) too.

Any possessor of a cap may designate new possessors just by sending them the bit pattern and notifying the OS who they are so that when they present that bit pattern the OS can honor it as a cap. Anyone who possesses a cap may use it by presenting it along with its request to the OS. The OS verifies using its internal tables that the presenting process is in fact a possessor of that cap and that the function provided when the cap was created verifies this usage as valid, and if both tests pass it grants the usage.

The typical pattern would be to pass along or distribute a created cap and retain the monitor and revocation caps.

The OS, having a complete picture of the capabilities, knows which capability was derived from what, and can invalidate the appropriate derived capabilities any time one of their roots is revoked.
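To make that concrete, here is a rough sketch in Python of what I have in mind. It is illustrative only: the monitoring stream is omitted, the restriction closure is just a Python callable, and the "resource" is modeled as a callable that services a request.

    import secrets

    class OwnershipKernel:
        def __init__(self):
            self._caps = {}        # cap -> dict(resource, check, derived, revoked, revoker)
            self._holders = {}     # cap -> set of pids the OS has seen it delegated to

        def create(self, pid, parent, resource, check=lambda request: True):
            cap, revoker = secrets.token_hex(16), secrets.token_hex(16)
            self._caps[cap] = dict(resource=resource, check=check,
                                   derived=set(), revoked=False, revoker=revoker)
            if parent is not None:
                self._caps[parent]["derived"].add(cap)   # remember the derivation tree
            self._holders[cap] = {pid}                   # creator is the sole current possessor
            return cap, revoker

        def delegate(self, sender_pid, cap, receiver_pid):
            # Sending the bits and registering the new possessor is one kernel call,
            # so the OS's picture of who holds what stays current.
            if sender_pid not in self._holders.get(cap, set()):
                raise PermissionError("sender does not possess this cap")
            self._holders[cap].add(receiver_pid)

        def use(self, pid, cap, request):
            entry = self._caps.get(cap)
            if entry is None or entry["revoked"]:
                raise PermissionError("no such capability")
            if pid not in self._holders[cap]:
                raise PermissionError("presenter does not possess this cap")
            if not entry["check"](request):
                raise PermissionError("rejected by the creator's restriction check")
            return entry["resource"](request)

        def revoke(self, revoker):
            for cap, entry in self._caps.items():
                if entry["revoker"] == revoker:
                    self._revoke_tree(cap)

        def _revoke_tree(self, cap):
            self._caps[cap]["revoked"] = True
            for child in self._caps[cap]["derived"]:     # roots take derived caps with them
                self._revoke_tree(child)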

Do you mean limited forms of the claims being made about capabilities are actually feasible? Can you identify these claims?

The most frequent claim involves pervasive, automatic POLA limiting of programs. This is an admirable goal, and yes, you could technically implement it with capabilities, provided users could be bothered to actually sit down and understand, identify, and approve each and every capability their programs actually needed. But they can't. Instead they will demand (and get) a shortcut that just gives every program they run every capability it wants. They want to not have to think about it and work on it even more than they want security.

The ocap model, which I hadn't previously considered, accounts for a claim that user identity need not be tracked -- because there's an implementation of capabilities where users actually don't have to be tracked. But in this model, you have to have secrecy in order to obtain unforgeability, which imposes restrictions on the system that I really consider untenable. Also when you have a malefactor or someone passing capabilities on to malefactors, you can stop them eventually using the Horton protocol, but you cannot discover who they are and hold them personally accountable, which I consider essential.

Another claim involves fine-grained security based on any discriminating characteristic about users -- usually heard w/r/t underage users accessing "adult" websites, etc. Capability security can help with this, provided the discriminating characteristics are actually available in some checkable form for it, which is not too much of a stretch.

Another claim involves locking spammers out of your email account while keeping your actual friends able to contact you. Capability security could help with this (via revocability) but we'd need to replace SMTP first, meaning we'd have to redefine what "email" means. In practice I don't think it's likely.

(and still consider superior

(and still consider superior to the "pure" ocap model where identity is not tracked but secrets are required) [...] But in this model, you have to have secrecy in order to obtain unforgeability, which imposes restrictions on the system that I really consider untenable.

I'm not sure what secrets you're referring to. The ocap system OS shap and I described does not rely on secrets; it relies on partitioning data and capabilities, such that you can only designate capabilities via a descriptor, not a pointer. Only distributed systems require secrets, and that will likely be the case for any distributed security model you'd care to implement.

Any possessor of a cap may designate new possessors just by sending them the bit pattern [1] and notifying the OS [2] who they are so that when they present that bit pattern the OS can honor it as a cap. Anyone who possesses a cap may use it by presenting it along with its request to the OS. The OS verifies using its internal tables [3] that the presenting process is in fact a possessor of that cap and that the function provided when the cap was created verifies this usage as valid [4], and if both tests pass it grants the usage.

1. This almost sounds like a Password Capability system.
2. Sounds vulnerable to TOCTTOU problems. Sending and registering the delegation must be a single atomic step to avoid them.
3. So far this sounds pretty much like ocaps or password capabilities, which are closely related. You haven't explained how identity factors into this, but presumably you track every delegation, stamp it with the identities involved, and store it in the tables. If so, you've just reinvented the Horton protocol.
4. I'm not sure how this is intended to operate in an OS. Where does this function reside, in user space or kernel space? Is the function completely arbitrary?

The most frequent claim involves pervasive, automatic POLA limiting of programs. This is an admirable goal, and yes, you could technically implement it with capabilities, provided users could be bothered to actually sit down and understand, identify, and approve each and every capability their programs actually needed. But they can't.

This is a prevalent myth. See CapDesk, Polaris and Plash for how usable least-authority systems work. The fundamental principle behind their usability is the unification of designation and authorization, which is exactly what a capability is.

you can stop them eventually using the Horton protocol, but you cannot discover who they are and hold them personally accountable, which I consider essential.

"Who" is a very nebulous concept in computers, as Marc pointed out above. Generally you're only interested in "who" at the highest level of a system where it interacts with users, which is a perfect spot to inject Horton. The bowels of a software system have very little use for "who" is doing something.

Another claim involves fine-grained security based on any discriminating characteristic about users -- usually heard w/r/t underage users accessing "adult" websites, etc. Capability security can help with this, provided the discriminating characteristics are actually available in some checkable form for it, which is not too much of a stretch.

That's the first I've heard that particular claim! Capability proponents generally acknowledge that confinement of bits is almost impossible, particularly when considering covert channels. I can see how something like the "adult web" filtering could work with capabilities, but it would involve overhauling significant Internet infrastructure. Certainly not a trivial undertaking.

Another claim involves locking spammers out of your email account while keeping your actual friends able to contact you. Capability security could help with this (via revocability) but we'd need to replace SMTP first, meaning we'd have to redefine what "email" means. In practice I don't think it's likely.

Not at all. Adapting Waterken-style web-keys to work over SMTP is actually pretty straightforward. Every delegation just creates a new unguessable e-mail address, or alternatively includes an unguessable component that must be combined with the human-readable component, e.g. naasking+a838dhJnbdd7h@host.com. Everything in the e-mail protocol stays the same, except for an extension to the delegation protocol (create a new unguessable address from address X) and the addition of a revocation protocol (revoke address X).
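A toy sketch of that bookkeeping in Python (the tag format and names below are just illustration, not a proposed standard):

    import secrets

    class MailCaps:
        """Per-mailbox table of unguessable delegation tags."""
        def __init__(self, user, host):
            self.user, self.host = user, host
            self._live = {}                                 # tag -> who it was delegated to

        def delegate(self, recipient):
            tag = secrets.token_urlsafe(9)                  # unguessable component
            self._live[tag] = recipient
            return f"{self.user}+{tag}@{self.host}"         # e.g. naasking+a838dhJnbdd7h@host.com

        def revoke(self, address):
            tag = address.split("+", 1)[1].split("@", 1)[0]
            self._live.pop(tag, None)                       # mail to this address now bounces

        def accepts(self, address):
            try:
                tag = address.split("+", 1)[1].split("@", 1)[0]
            except IndexError:
                return False
            return tag in self._live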

Not quite least authority

See CapDesk, Polaris and Plash for how usable least-authority systems work.

The "least authority" design rule is simply wrong, and always has been. It arose as a counter-proposition to then-existing systems in which processes had (in practice) nearly universal authority.

Practical systems must strike a balance between minimizing authority on the one hand, and not dividing it into such fine increments that humans cease to be able to manage the complexity. In practical terms, there is such a thing as having unmanageably minimized authority.

What does seem to be true is that starting from "no initial authority, then add what you actually need" has worked in two senses:

  • It tends to effectively combat excessive authority, and
  • By doing so, it has allowed capability-based designers to investigate the human-manageable tradeoffs, patterns, presentation, and idioms of authority.

To the extent that they are actually usable, CapDesk, Polaris, and Plash succeed by making good balancing choices. In the process, they stepped back a bit from the brink of truly minimized authority.

Some disagreement on a few points

The most frequent claim involves pervasive, automatic POLA limiting of programs. This is an admirable goal, and yes, you could technically implement it with capabilities, provided users could be bothered to actually sit down and understand, identify, and approve each and every capability their programs actually needed. But they can't.

See CapDesk for a counterexample (or at least a partial counterexample) which illustrates that the argument above isn't quite correct: users don't have to sit down and understand, identify, and approve each and every capability their programs actually need.

The ocap model, which I hadn't previously considered, accounts for a claim that user identity need not be tracked -- because there's an implementation of capabilities where users actually don't have to be tracked. But in this model, you have to have secrecy in order to obtain unforgeability

Hmm. This does not sound right to me; I believe there are implementations of the ocap model that do not rely upon secrecy and do not rely upon tracking user identity. See, e.g., object capability languages.

Also when you have a malefactor or someone passing capabilities on to malefactors, you can stop them eventually using the Horton protocol, but you cannot discover who they are and hold them personally accountable, which I consider essential.

Personally, I don't see this as a compelling criticism. Thanks to covert channels and proxying, you can't track down the end malefactor in any system, whether ocap, ACLs, or something else. In ocap systems (and other systems), you can track down one of the users in the chain, though this may not have been the user you wanted to track down.

The Enforcement Myth

Thanks to covert channels and proxying, you can't track down the end malefactor in any system, whether ocap, ACLs, or something else.

There is an important point hiding in this statement. Those of us in the oCap community have long since adopted the KISS principle. As part of that, we have abandoned attempting to enforce policies that demonstrably cannot be enforced. There are many policies people want that fall into the "cannot be enforced" category. People don't like that, but truth does not respect opinion.

Slides for my talk

OK; for my survey talk, I don't have any other video to share with you, but here are annotated slides from an earlier version of that talk: http://www.cs.berkeley.edu/~daw/talks/TRUST07.pdf

It's not the best version of the talk, but if you prefer to avoid Youtube, it's the best I have to offer. Sorry I don't have anything better.

Secrets

I'm not sure what secrets you're referring to. The ocap system OS shap and I described does not rely on secrets; it relies on partitioning data and capabilities, such that you can only designate capabilities via a descriptor, not a pointer.

When you talk about partitioning data and capabilities using paging hardware and I talk about keeping secrets, we're talking about the same thing. You have something that can't be exposed to user space - ie, you have a secret.

In my universe, secrets have to be accessible somehow. The machine has an owner or root authority, and the owner has to be able to know, with certainty, what capabilities exist on their system, audit them, revoke them as necessary, verify the NON-existence of certain caps, etc. It is not acceptable to have them outside the root authority's control, so they might as well be part of kernel memory and accessible to root through the OS.

Perhaps my distinctions between secret and non-secret are a bit blunt-edged, but experience teaches me to distrust secrets and the ability of machines to keep them. There are simply too many low-level tools available to attackers, and too many well-meaning fools who will want to "extend" things in various ways.

Conversely, if we create hardware that's not amenable to the use of those low-level tools, we create the possibility of locally-undetectable remote root capabilities. I don't want some stranger in Redmond, Washington; Richmond, Virginia; or Hollywood, California to "own" the box I bought and paid for, and right now that seems the most likely outcome of combining hardware-enforceable secrets with network access.

Capabilities, especially long-lived capabilities, are too valuable to be completely hidden or completely available. Hence, I think they belong in kernel space.

"Who" is a very nebulous concept in computers, as Marc pointed out above. Generally you're only interested in "who" at the highest level of a system where it interacts with users, which is a perfect spot to inject Horton. The bowels of a software system have very little use for "who" is doing something.

It's not nebulous at all if you look at the security requirements. And the requirements are straightforward. If a cap that should not exist arises on your system you must have the ability to detect the cap's existence, revoke the cap, determine who (which user/s) broke your security, and what (which program/s) enabled them to do so, whether accidentally or purposefully. Even if it's a case of programs functioning as designed, the design may be inconsistent with the local security requirements and you won't necessarily know it until you see the assembled cap that results.

So the obvious units of dynamic creation, delegation, and use of capabilities are process IDs (so you can tell which programs enabled the breach). The equally obvious units of persistent storage of capabilities are user accounts (so you can tell which users' stored authorities were used to create the cap). And the requirement that root can detect and revoke caps implies that the capabilities must be explicitly maintained in a space readable by, and in a format searchable by, OS utilities available to root. This is a very straightforward inference from the requirements. If there's some subtlety here, I'm missing it.
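In code terms, the inference amounts to something like this (a hypothetical record layout, just to show what "readable and searchable by root" would mean in practice):

    from dataclasses import dataclass

    @dataclass
    class CapEvent:
        cap: str          # which capability
        action: str       # "create", "delegate", or "use"
        pid: int          # which program did it
        user: str         # whose stored authority was involved

    def explain(cap, audit_log):
        """Everything root needs for the post-mortem: every event touching this cap."""
        return [event for event in audit_log if event.cap == cap]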

When you talk about

When you talk about partitioning data and capabilities using paging hardware and I talk about keeping secrets, we're talking about the same thing. You have something that can't be exposed to user space - ie, you have a secret.

This is not even remotely the same thing. Just because the bits of a Java reference are inaccessible to Java programs doesn't make said bits a "secret" in any meaningful sense, and it certainly doesn't lead to insecurity or a lack of extensibility.

Furthermore, as shap has said further up in this thread, you can't turn bits into capabilities anyway, so even if you could access the bits of the capability it wouldn't do you any good. Capabilities are protected, opaque, like memory safe references, not secret. No security properties of an ocap system are dependent on the secrecy of the bits in a capability.

I don't want some stranger in Redmond, Washington; Richmond, Virginia; or Hollywood, California to "own" the box I bought and paid for, and right now that seems the most likely outcome of combining hardware-enforceable secrets with network access.

What OS are you going to write that doesn't fundamentally depend on hardware protection? Unless it's a safe language based OS, in which case a trusted runtime that protects references replaces your trusted kernel that protects capabilities. There is no semantic difference between these two alternatives.

Capabilities, especially long-lived capabilities, are too valuable to be completely hidden or completely available. Hence, I think they belong in kernel space.

One of us is confused. Ocaps are in kernel space. I've said this many times. Further, if they are in kernel space they are completely hidden, so you contradict yourself.

If a cap that should not exist arises on your system you must have the ability to detect the cap's existence, revoke the cap, determine who (which user/s) broke your security, and what (which program/s) enabled them to do so, whether accidentally or purposefully.

What are the criteria for determining that a cap should not exist? I can't even think of a situation in the ocap model where you could create a capability that should not exist. How do you express the policy that a cap should not exist? Who or what expresses this policy? Who or what enforces this policy? Why does this agent have the right to dictate and enforce these policies? I'm afraid this is not so simple as you imply.

The rest of your requirements are already handled by the ocap model, with the caveat that root is not needed, nor are identities. In fact, only a few primitive kernel-backed objects are needed to bootstrap such a system, the primary one being a "Space Bank", which is a system storage manager. It doles out pages and capages to objects, and those objects can do anything they like with them, and a "user account" is merely a program that aggregates storage and services on behalf of a user.

An administrator does not need to know what accounts or other programs are doing with storage, but if they are abusing the storage managers or other delegated services, the admin can revoke access to them or simply reclaim the storage to destroy the account. Accounts are managed via a Horton-like protocol to assign responsibility. Again, there is little use for built-in identities.

I think you're putting the cart before the horse, starting with complex high-level concepts and trying to build a coherent system from primitives with imprecise semantics, instead of starting with a small set of simple services which can be composed to build the high-level services you want. I highly recommend you read the EROS/CapROS papers to see how this could be done.

DRM != Hardware Protection

What OS are you going to write that doesn't fundamentally depend on hardware protection?

I think Ray is concerned about hardware measures that guard the data from the user who owns the machine. For example, enforcible DRM relies on the presence of authority held by the DRM software that is not wieldable by the human who owns the machine.

Ray's concern is valid. It is easy to conceive of designs in which my machine can be turned against me. One of my concerns about the iPad mess is that Apple appears to be headed in this direction.

But there is a middle position. In KeyKOS/EROS/Coyotos, a subsystem can hold authority that you cannot get, but it cannot hold that authority without your knowledge. It is possible to ask, before instantiating a subsystem, whether its initial authority set {IA} contains capabilities not in your authorized authority set {AA}. If it has such a secret, you are free not to run the program. Those systems also have a notion of an opaque "bag" of capabilities, so it's possible to give every user a bag of (say) all capabilities granting access to the network. The bag can then be used in this comparison without granting to the user all of the access that is sitting in the bag.
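In rough Python, the pre-instantiation check amounts to the following. This is a simplification of what those systems actually do; in particular, the "bag" stays opaque to the user, and only the comparison gets to look inside it.

    def safe_to_instantiate(initial_authority, authorized_authority, opaque_bags=()):
        """Refuse to run a subsystem whose initial caps {IA} exceed the authorized set {AA}."""
        allowed = set(authorized_authority)
        for bag in opaque_bags:
            allowed |= set(bag)       # used for comparison only; never granted to the user
        return set(initial_authority) <= allowed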

Ultimately, however, this concern serves to illustrate why core system audit and public reproducibility become important. The more secure the system becomes, the more critical it is for the process of its fabrication to be openly auditable.

I think Ray is concerned

I think Ray is concerned about hardware measures that guard the data from the user who owns the machine. For example, enforcible DRM relies on the presence of authority held by the DRM software that is not wieldable by the human who owns the machine.

Personally, I think that's a pretty hopeless effort, both legally and technologically. You can't protect a machine from its owner; you can only make it more difficult. If it turns out to be difficult, a market will spring up to satisfy that need. Look at all the jailbreaking/unlocking services provided for cell phones and the iPhone in particular. An ocap system used for DRM can make life more difficult for an owner, and may require the use of a hardware hack to defeat, but it will never be unassailable.

Unfortunately, it's not hopeless.

I don't mean to get into a religious debate, but it isn't.

If memory (both data and program) is well encrypted using an asymmetric key, and the decryption key is on the CPU and never leaves it, then you can make it effectively impossible to read the machine code or the data it's manipulating off any motherboard traces or buses. That was the vision behind TCM/Palladium.

Compiling a program for the machine, or reading unencrypted data from external media, would require use of an encryption key, but knowing that key won't enable you to actually read the encrypted data as it goes past the CPU.

Or it could be done in other ways. You could put a whole functioning computer, including memory, on a single chip. Now there are no traces to monitor.

Encryption is not

Encryption is not unassailable either. Just look at all the successful timing and MITM attacks against SSL, and those are remote exploits. With the additional bandwidth available to local programs, the key will be extracted even more quickly.

Most security is not broken; it is circumvented. I don't remember who said that, but it will probably always be true.

Ray is correct

While I agree with naasking that encryption can be attacked, the level of practical difficulty in doing that in a modern DRM system is quite high, and the social cost of success (jail) is also high. That is: naasking's approach may work for a few hackers, but it doesn't work as a collective, societal defense against the privatization of data.

There is a view in many parts of the open source community that this "privatization through technical means" is an inherent threat to freedom, that the enforcement of access control or copy control through (de practico) unassailable technical means fundamentally alters the balance of freedoms in notions of copyright and free expression, and that this type of freedom of expression and re-use is one of the pillars of a modern, open society.

This view is correct, but as in so many things it cuts both ways. The only answer is legislated weakening of encryption, but note that this simultaneously defeats privacy, and that legislative rules can be gamed -- which is why copyright, originally intended to run 14 years with a possible 14-year extension, now stands at 95 years.

This is a discussion that (in my opinion) is far off the intended topic area for LtU, so I won't belabor it, except to note that the impact of enforceable safety is not unilaterally good, and that neither programming languages nor operating systems can be responsibly designed without consideration of their social impacts.

I think Ray is concerned

I think Ray is concerned about hardware measures that guard the data from the user who owns the machine. For example, enforcible DRM relies on the presence of authority held by the DRM software that is not wieldable by the human who owns the machine.

This is exactly the case. If capabilities are literally unreadable and unsearchable, by anybody, then I am concerned that machines I buy in the future may have provisions in ROM that grant excessive capability to some external agency or to programs provided by that agency, and that as the machine owner I will not have the capability to detect or revoke them.

Ordinary page-table security does not bring up this same problem as much. The values of, eg, memory pointers have properties that make unauthorized tampering with them far less likely to yield a result desired by some outside party willing to pay money for that result. Also, changes in those values are far more likely to be immediately detected via (and the damage done limited to) a crash.

Perhaps I am excessively paranoid about this, but people have been throwing stupid amounts of money and power around trying to make it happen, so I am extraordinarily wary of infrastructure that could enable it.

Don't need capabilities for that

I am concerned that machines I buy in the future may have provisions in ROM that grant excessive capability to some external agency or to programs provided by that agency, and that as the machine owner I will not have the capability to detect or revoke them.

You don't need capabilities for that. All you need is one-way disable bits (e.g. the VMX feature disable bit on Intel hardware). Or third-nation fabs. This is why DARPA is presently so concerned about the security implications of the ever-declining American chip fabrication capabilities. There have been real examples of security breaches built into hardware in this way.

Merit of audit unclear

In my universe, secrets have to be accessible somehow. The machine has an owner or root authority, and the owner has to be able to know, with certainty, what capabilities exist on their system, audit them, revoke them as necessary, verify the NON-existence of certain caps, etc.

This depends greatly on your security policy. All of the administrative operations you describe can be accomplished with "get capability bits" and "write capability". None of them require the authority to read the existing capabilities as authority.

But you presuppose that the administrator has the right of search. There are many systems in which this is undesirable, and the argument for why it is necessary has never been supported by any sort of comprehensive analysis or investigation.

In any case, there is no reason for the system root to disclose capabilities to the administrator as authority. As an analogy, the JVM does not disclose raw pointers to the user.

Caps that should not exist

If a cap that should not exist arises on your system you must have the ability to detect the cap's existence, revoke the cap, determine who (which user/s) broke your security, and what (which program/s) enabled them to do so, whether accidentally or purposefully.

In a properly designed capability system, this is a logical impossibility. Capabilities do not "arise". They exist (logically) from the moment the machine is turned on. The problem, then, is not the existence of a capability, but the fact that some program you do not approve of has come to hold some capability that grants authority you care about.

But the only way this can come about is if some first program F transferred (i.e. gave) that capability to the undesired program P. So the mere fact that you are in this state means that you boogered your security policy from the get-go.

There are a variety of means to prevent this and/or to provide for later revocation that you seem to be disregarding. Perhaps before making pronouncements and judgments about how delegation "must" be handled, you should attempt to state the requirement you think you want and ask whether/how it is handled already.

If you're inclined to do this, most of the best capability brains in the world are over on the cap-talk list, so that's the right place to do it.

Boogered up configuration

But the only way this can come about is if some first program F transferred (i.e. gave) that capability to the undesired program P. So the mere fact that you are in this state means that you boogered your security policy from the get-go.

Exactly! And I want to be able to watch the "instant replay" to see exactly where the cap came from, how it was transmitted to the program that I don't desire to have it, and thereby figure out exactly how I boogered it up.

This could happen when, e.g., one or more programs do not behave the way I thought they did and delegate a cap that I didn't think they'd delegate, or delegate it to another program I didn't think they'd delegate it to.

Anyway, if I have a dynamic security system, then it can be arbitrarily complex. If it can be arbitrarily complex, then it can be misconfigured (boogered up) and need debugging. If it's my job to debug it, I want information about what went wrong and why.

There is a catch-22 here

The same means that allow you to debug it generally allow an adversary to debug it.

In any case, as I said, solutions exist to the problems you are raising. Perhaps you should look into them before inventing new ones.

Thanks everybody.

I just wanted to thank everybody who's posted here.

Despite acting a little bullheaded about what I want from a security configuration, I have learned a fair amount from this discussion. Also, thanks to those who've pointed me at further reading.

It turns out that capabilities work mostly the way I thought they did, although my understanding was incomplete. In particular, I hadn't considered the application of cryptographic digital signatures to proving caps, nor the idea of different parts of a process (or individual objects) holding capabilities not held by the program or process as a whole, and it seems I more or less assumed something like the Horton protocol without knowing its name or that it had been developed separately.

Anyway, thanks.