Object capabilities for protecting object slots in prototype-based languages?

Hello LtU community,

As this is my first post here in LtU, I'll briefly introduce myself. My name is Denis Washingotn, and I am an end-of-first-semester CS student at the Humboldt University Berlin, Germany. I am very interested in programming language theory and did much personal research on it in the last few years, including writing an LLVM-based compiler for a safe statically compiled C-like language in C++ as a hobby project. (I actually planned to release that as open source and had already implemented several control structures, boolean and number types, arrays and first-class functions, but lost interest because after more and more research because I increasingly found the language to be too "baroque" and unexciting.)

Nowadays, my interest lies primarily in highly dynamic, reflective object-oriented languages; in fact, I am in the process of trying to design a prototype-based language vaguely in the vain of Self, NewtonScript and Io, with a special emphasis on conceptual simplicity and reflective capabilities a'la Smalltalk while trying to maintain a reasonable level of approachability for users of established mainstream scripting languages (like Python, Ruby, PHP, etc.).

In relation to this, I have been investigating the possibility of effective encapsulation of state for highly modifiable slot-based objects as found in several object-centered languages. Regarding this I found a paper on the encapsulation mechanism of Self prior to version 3 (Parents are Shared Parts: Inheritance and Encapsulation in Self) where slots could be declared as "private"; such slots can then only be referred to by objects of the same delegation family. (However, according to the comments on the selflanguage.org discussion forum, the privacy declarations have been degraded to pure annotations in Self version 3 because it "was found to be unworkable", partly because the protection mechanism could be easily circumvented due to Self's dynamic inheritance, which is also mentioned in the paper.) I also learned about a seemingly well-known JavaScript pattern where private data is modeled as capturing of local variables during object construction, but I find this scheme to be quite unpractical as it requires any method needing to access private slots needs to be effectively copied for each object clone, which defeats message delegation. Other than that, I haven't found anything concrete on the topic yet.

So I thought about the issue myself, and came up with an idea that I would like to share with you. Suppose an object is a collection of slots, each of which associates a name symbol with a value. It is possible to retrieve a slot's value from an object by sending it a message, "getSlot", with the name of the slot as argument. Now, further suppose that name symbols are themselves objects, and are unique: there are never two symbol objects which denote the same name. So, in fact, objects slots can be thought of as object-object (instead of string-object) associations. We could thus generalize: instead of special name symbols, "getSlot" could instead allow any object to act as slot key, making it possible to to associate any object with any other in the context of a specific object.

Now my idea: given the above, it is clear that in order to refer to a slot in an object, two things are needed: the object itself, and the slot's key object. This means that without a reference to the slot key object, it is impossible to access to corresponding object slot. In the simple case that all keys are name symbols, this is a trivial statement, as in most languages, it is possible to refer to any symbol from any context by the way of literals. But if slots are modeled as generalized object-object associations, it is possible to create a key object which is only available in the context of its creator, and which may be used to control access to the slot by passing it only to privileged parties. Thus, such a system would enable the programmer to use arbitrary objects as capabilities for accessing a slot, in the spirit of existing capability systems on the object reference level.

This feature could be used as a powerful way to encapsulate slots in any level of granularity. For instance, in a language with full block closures, one could write something like the following to define slots which are private to a specific set of object methods (Io/Smalltalk-like pseudo code):

o := Object clone

[
    priv := Object clone

    o foo := method [
        ...
        foo := o getSlot(priv)
        ...
    ]

    o bar := method [
        ...
        o setSlot(priv, "bar")
        ...
    ]
] value

(The [ ... ] value should signify that the code is run in a private local activation record to ensure that priv does not leak out to the surrounding namespace.) These methods could also grant other objects the right to access the slot by sending them the priv object, which would amount to something like a dynamic version of the C++ "friend" feature. Numerable other arrangements are thinkable.

What do you think of this idea? Is it sound? I know that this requires careful choices regarding the reflective features of a language with such a protection mechanism; for instance, it shouldn't be possible to enumerate all slot keys of an object (as in Javascript). But except for this, is there anything I have overlooked? I am sure I'm not the first one to have this idea. ;) Any feedback on this is very welcome!

P.S.: Lambda the Ultimate help me tremendously with my PLT research, so a big thanks to everyone who is posting here! You helped me a lot in the last years, and I am absolutely certain that you will continue do so. :)

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Not the entire method, right?

I also learned about a seemingly well-known JavaScript pattern where private data is modeled as capturing of local variables during object construction, but I find this scheme to be quite unpractical as it requires any method needing to access private slots needs to be effectively copied for each object clone, which defeats message delegation.

Though it isn't as efficient as it could be, I don't think you'd necessarily need to copy the entire method. For example, what about just creating a trampoline that keeps track of the private variable closure and forwards it along to the "real" method?

Or is this already the overhead you were concerned about?

Overhead is a secondary issue

Yes, I have implied an efficiency concern with that statement, although I haven't been giving it much thought; as you have mentioned, there are ways of implementating this which are probably sufficiently performant. The main concern I have, however, is how this influences the prototype-instance relationship: because for the methods are explicitly assigned to each new object, the prototype inheritance relationship is lost. That is, if I do the following:

function Counter() {
   var value = 0;

   this.increment = function() {value += 1;};
   this.getValue = function() {return value;}
}

var c = new Counter();

then I cannot change the implementation of all Counter's increment() functions, like so:

Counter.prototype.increment = function() {print ("Incremented!");}
c.increment(); // still the old version

because c has an explicitly set value for bar and thus doesn't query its prototype. For "normal" methods (that is, those only assigned to Counter.prototype) this would work, however. Thus, you need to know beforehand which methods are private member accessing and which not, which is awkward. (I know the example isn't very illustrative, but you probably get the idea.)

Fake reference?

I think that one crucial point about 'object capabilities' is that they cannot be faked|guessed, so another constraint of your system is that it must not be possible to fake object references.

If you don't have deserialization, that's easy I think, if you do have this feature then object references must be something different than simple pointers (perhaps a pointer and a big unguessable number).

Thanks

Thanks for pointing out serialization, I haven't thought about that. However, I didn't plan for any form of built-in runtime-level serialization mechanisms in my language, so this shouldn't be an issue. Object references are naturally designed to be unforgeable (that is, there is nothing like pointer arithmetic or a global object table).

OCaps vs destructive reflection

Shouldn't setSlot be protected by an ocap? Scenario: your system has a "console" object that's used widely for console I/O. You download some untrusted code from wherever and your system passes the console object to the new code with the print reference because the docs said it needs to log some stuff. The untrusted code does a "console setSlot(print, doNothing)". Now suddenly none of your code can print to console.

Or do you always do a clone of any ocap before passing it to untrusted code?

True

The model is unfortunately not sufficient to model separate read and write capabilities; the object key is always a capability for both. There are ways around that, like introducing the ability to "freeze" slots make them immutable, or to only pass a key to a new slot which returns the actual slot's value through a method; but I admit that these are not optimal (though maybe sufficient in practice; in any case, the assumed main use of this scheme, which is slots private to a single object - or a group of mutually trusting objects -, does not seem affected by this restriction).

Destructive Reflection

Nice catch.

setSlot( getSlot, \ slot -> "Muhuhahahaha!" )

do you always do a clone of any ocap before passing it to untrusted code?

If you need special protocols to handle 'untrusted code', you certainly don't have an ocap system.

Oversimplified base case

I may have oversimplified a bit by bringing "getSlot" on the table. I guess that for this to work, slot lookup needs to be intrinsic to the system, and this is actually how I planned it for my language. More specifically, a slot of an object can only be accessed by sending that object a message with the key as selector. As in Self, if the slot contains a method object it is executed and its return value is the result of the message send, otherwise the slot's value itself is returned.

So, no "getSlot" overriding madness. ;)

Not Quite Capabilities

What you present above is actually a rights amplification pattern... i.e. to access a slot, you need both an unforgeable reference to an object, and an unforgeable reference to its key. (You could just as easily use a password, GUID, or certificate for the latter.) Authority is therefore being mediated through this rights amplification protocol - similar to certificate authority systems or ADTs with first-class modules. By comparison, ocaps are all about making permission indivisible from holding a reference, which by nature excludes rights amplification.

With rights amplification, you cannot be assured that the recipient whom you pass an object reference will have the same rights to it you have. This can become a problem for reasoning about security. (You can model rights amplification protocols, password systems, identity-based authority, certificate authority, et cetera using ocaps, but that doesn't mean these other models represent ocap systems.)

(I'm not saying your system is 'bad', just that your description is a bit misleading. The JavaScript approach of hiding caps and cloning methods is far more direct a way to protect private elements by use of ocaps, i.e. by simply not sharing references to new variables. I'll write a separate post to discuss quality of your solution.)

I understand

I understand your objections to using the term "object capabilities" in this context. I was actually just referring to the analogy to object capabilities regarding access control through references and unforgeability, but you are right that this makes objects themselves move away to be object capabilities in the pure sense of the word (in that a reference to an object may mean different capabilities in different contexts depending on which keys are available in these contexts).

However, the password analogy you mentioned is a bit misleading, too, as passwords are forgeable - they are inherently based on being "only" hard to guess, while I am explicitly talking about the use of unforgeable keys (I know you actually noticed that, I just wanted to highlight the flawed analogy).

securing metaobject protocols: mirrors, proxies, views, ...

Hi Denis -- you hit on a really tricky issue challenging capabilities for modern dynamic languages: the meta object protocol!

You might want to poke at Gilad Bracha's work on mirrors (for Newspeak?), Mark Miller's on Proxy objects for ECMAScript / Caja, and our joint work on mixing in the single origin policy with JavaScript (object views) (we first locked down the MOP and then started playing with exposing it with our extension to the membrane pattern). Short of rolling your own language, you might find some our the base libraries for controlled passing of references across browser frames to be a fruitful starting point.

As for the performance issues, there was a thread about this last summer or fall, but, essentially: the indirection table in Smalltalk gives you a lot of flexibility (I think something similar was adopted for proxy objects in ecmascript), PICs are another approach for folding in mostly-static accessors (you typically wouldn't be changing them every 1000 instructions), and, of course, tracing would also remove the overhead. Addressing the issue for compilation approaches used in static languages seems like either an anachronism or a mismatch for the dynamic nature of the ability to dynamically control the MOP.

While you're reviewing the literature

While you're reviewing the literature, also have a look at Web Sandbox. It was influential in establishing the proxy model of capabilities for JavaScript, and while it's still a research project, it is currently in use as an DOM isolation technology for Hotmail.

Thoughts on the model

The key-based solution you present could just as easily be modeled by storing all the state in the key as in the object. That is, getSlot(foo,bar) could be implemented to look inside 'foo' or 'bar' or even a global relation inside 'getSlot'. From the semantics perspective, it doesn't make much sense to distinguish one or the other as bearing the dictionary (unless some other op makes it relevant).

It seems this keyed approach could easily introduce hidden channels for purposes other than hiding private data.

If your "interest lies primarily in highly dynamic, reflective" systems, then I caution you against prototype-based OO. In large or long-lived prototype systems, such as a LambdaMoo, decisions and data easily become entrenched deep in your object model, and the system stagnates. It becomes difficult to extend the system for processing outside your initial design. For highly dynamic systems, you need to support multiple, consistent, parallel data models. Decisions made in one model must propagate to the others, possibly coordinate with them. You'd be better off pursuing bi-directional and reactive programming, or pluggable architectures.

I haven't seen enough work

I haven't seen enough work in bidirectional nor reactive systems to substantiate claims about large, long-lived systems (and I am aware of large, long-lived projects that use them). Furthermore, I don't typically associate bidirectional languages with reflective ones -- they're very constraining in practice, at least today, which isn't fun when working with reflective (and, I assume of interest, introspective) abilities: that's hard code to analyze, so inverting it is a bit of a reach. Likewise, bidirectionality is typically orthogonal to dynamism (they don't mix well, so they're shielded from each other whenever I see a system with both). Overall, I'm confused by your positioning of these ideas.

This isn't a critique of the ideas. They're fascinating -- I used to make reactive languages and am now doing the runtime optimizations for, in part, a bidirectional language.

My experience...

Most of my dayjob work is in large, long-lived, reactive systems. However, the systems I speak of are architecture-based rather than language-based... e.g. publish/subscribe, blackboards or tuple spaces. These architectures have proven very flexible (and moderately scalable) for integrating and independently developing new components and extensions.

While these architectures are flexible (aka dynamic) 'in-the-large', it turns out they are rather painful to work with 'in-the-small'. The languages we use (C++ and JavaScript, mostly) are simply not well adapted to the architecture. Managing subscriptions and caches are especially painful.

On occasion the 'native' data format (the initial view presented to modules) happens to be the view we need to make good decisions and take actions. In that case, all we need is a simple subscription, or simple updates. Since it simplifies our lives, we try to ensure this is the case as often as possible by intelligently designing (and redesigning) the data model.

But there will always be some user story that needs a different view. A common case I work with is adapting to new protocols. In those cases, a module is forced to gather a bunch of scattered data into a new view, often managing a large number of subscriptions, dropping or adding subscriptions based on changes in the data. Keeping a view 'consistent' with the native model is a non-trivial exercise. Eventually, we make decisions and take action. These cannot just be reflected in the local 'view' we have constructed; our decisions must propagate back to the native model (and then further, to sensors and actuators). Thus, we also maintain the relationship from the view back to its source.

The fundamental requirements seem to be reactivity (to maintain multiple, consistent views in a concurrent system) and bidirectional mapping (so decisions and actions made on a view will propagate to its source). This becomes difficult when a 'view' comes from composing observations from diverse elements in some other model - which is not uncommon. Multiple models and bidirectional influence, together, give us dynamism - the ability to extend the system with new protocols, new data sources, new ideas, new domains and disciplines.

(Modeling everything as data is also useful for highly dynamic, reflective systems. For my dayjob project, the design we came up with added method-calls separate from the data model for reasons of 'efficiency'. We have since come to regret that decision because it hinders extending the system with replay, auditing, and post-mission analysis. Were we to do it over, every method-call would be associated with a fresh data entity.)

You say bidirectional is 'very constraining', but I do not mean to suggest our languages should constrain us such that every function and data-binding must be bidirectional and lossless. I'd be very happy if our languages simply made the bidirectional properties a lot easier to achieve and compose. At the very least, we need to more tightly couple the notions of 'reactive', 'view', and 'control'. It is only natural that, when we see something, we want to reach in and touch it. The problem with most OO data models is that such changes tend to modify the view rather than the cause for the view. My own efforts towards more bidirectional programming are based on the notion of demand-driven systems, e.g. a video camera might power on only because it is being viewed. Parameterized demands can serve both as queries and constraints.

I've read a lot about your work with flapjax. I don't see anything on your perpetual-student page about bidirectional programming. What is it you're working on?

Ah, I misunderstood reactive

Ah, I misunderstood reactive and bidirectional systems -- which are inherent properties of the program -- with language-based techniques for achieving them. My point was just that we don't know have proof that the language-based approach is the way to go or how for systems of scale; they're awesome, but very much research.

Thibaud Hottelier is designing a bidirectional layout and animation language. As one backend, a synthesizer finds attribute grammars to compute various directions of flow, which feeds into my part, a fast (parallel, SIMD, memory-optimized) AG solver, and a more murky story on the rendering (OpenGL and Qt today, hopefully canvas and OpenVG soon). As a frontend, you can use a lot of the normal syntactic abstraction facilities of the browser -- xml, selectors, cascading -- optimization problems we've done great with already wrt parallelism.

we don't have proof that the

we don't have proof that the language-based approach is the way to go or how for systems of scale; they're awesome, but very much research

I grant this is true. On the other hand, I've seen plenty of evidence that prototype-based OO does not retain 'highly dynamic and reflective' properties at scale. I believe seeking alternative solutions among architectures and experimental language design is quite rational. Taking the well-trodden path of prototype-based OO and expecting different results (without significant changes) is insanity.

Your new project sounds interesting. I'll look into it over the next week.

prototypes and layout

On prototypes: I used to do a lot actionscript development: there, the ability to manipulate the central MovieClip object was powerful. Unfortunately, the common use of prototypes -- js for the DOM -- crippled the main object of manipulation (dom) and now, while finally more consistent, the security camp (cough capabilities) put the nail in the coffin. I agree about the comments about scale and introspection. The dynamism isn't amazing, but does serve a purpose -- however, as
seen in JS's continuing evolution, the MOP needs to be rich (eg, dynamic accessors).

We haven't written much up recently. There was a rejected paper last year on the constraint language and I have a thesis proposal draft touching part of it; we've made great strides recently and will probably write up a few things this year (css semantics, new parallel algorithms and language implementations, etc.)

oops wrong place

oops wrong place (bad phone, bad!)

Lua

Keys in Lua tables are arbitrary objects, and it's a well-known idiom to use an object as a key (instead of a string) in order to avoid accidental collisions.

(Lua doesn't attempt to be capability-secure.)

MOO

Some prior work in this area was done by Pavel Curtis in his MOO language and system: ftp://ftp.research.att.com/pub/eostrom/MOO/html/ProgrammersManual_6.html#SEC6

It was nothing like capabilities, and there were problems with it in practice, but it was an early and interesting approach to the problem of providing controllable access to sub-parts of an object in a multiuser setting.