Safe interior references in the presence of mutation

Suppose that, like Rust and C/C++, we want to support references that point into the interior of another value. A reference<t> is something that can be dereferenced (deref : reference<t> -> t). A reference may also be mutable, so that we can set its value, and it may support other operations, such as walking a data structure. Examples:

  1. Given a struct x { a:int, b:int }, we can obtain a reference y=&x.a. Later on, we can read from that reference.
  2. Given a sum type, we can pattern match on it and obtain a reference to the value inside of it.
  3. Given an array, we can obtain an iterator of the array. We can dereference the iterator to obtain the current value, and we can call next() to obtain a reference to the next item. The same goes for an iterator over a tree.
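
For concreteness, here is a minimal Rust sketch of these three kinds of references (Rust is used only for illustration; the question itself is language-agnostic):

  struct S { a: i32, b: i32 }

  fn main() {
      // 1. A reference to a field of a struct.
      let x = S { a: 1, b: 2 };
      let y = &x.a;
      println!("{} {}", y, x.b);

      // 2. A reference to the payload of a sum type, obtained by pattern matching.
      let e: Result<i32, String> = Ok(3);
      if let Ok(n) = &e {
          println!("{}", n);
      }

      // 3. An iterator over an array; next() yields a reference to the next element.
      let arr = [10, 20, 30];
      let mut it = arr.iter();
      let first: Option<&i32> = it.next();
      println!("{:?} {:?}", first, it.next());
  }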

Note that "reference" in the sense I'm using is not necessarily just a pointer. A tree iterator may maintain a stack of parent nodes in order to be able to traverse a singly linked tree.

There's one nasty problem with references: in the presence of mutation they can lead to memory unsafety or run-time errors. Examples:

  1. Given the sum type (Either Int String) and x = Left 3, we can pattern match on x and obtain a reference r to the value 3 inside it. If we then mutate x := Right "foo", the reference r is no longer valid. If r is represented as a plain old pointer, dereferencing it will try to read an int where a string is now stored (see the sketch after this list).
  2. Given a tree iterator over some tree, we can go to somewhere in the middle of the tree. If we now mutate the tree, the iterator may no longer be valid.
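
In Rust terms, example 1 looks roughly like the sketch below; the borrow checker statically rejects the mutation while the interior reference r is still live (the offending line is left commented out so the rest compiles):

  enum Either { Left(i32), Right(String) }

  fn main() {
      let mut x = Either::Left(3);

      // Obtain a reference r to the 3 stored inside x.
      let r = match &x {
          Either::Left(n) => n,
          Either::Right(_) => panic!("expected Left"),
      };

      // Uncommenting the next line is rejected, because x is still
      // borrowed by r at the println below:
      // x = Either::Right("foo".to_string());
      println!("{}", r);

      // Once r is no longer used, mutating x is accepted again.
      x = Either::Right("foo".to_string());
      if let Either::Right(s) = &x { println!("{}", s); }
  }

Example 2 is analogous: mutating a collection while an iterator still borrows it is rejected in the same way.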

As you can see, it's not just a problem with pointers. Even in a language like Java, you have problems with iterators when the underlying data structure is mutated; at run time they may throw a ConcurrentModificationException.

Rust's type system attempts to prevent these kinds of errors statically, but it isn't fully finalized yet and it is quite complicated. I was wondering whether there is existing literature on this problem, and whether there is some elegant way to prevent these errors with types. It would probably involve linear types and capabilities.

If such a thing does not exist yet, one approach I can think of is to require a linear capability in order to mutate the value; to obtain a reference to the inside of the value, you must give up this capability, and in return you get a capability that allows you to read through the references. If you later want to mutate again, you have to give up the capability to read the references. Is this a good approach? Why or why not?
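
As a rough sketch of that trade, here is one way it could be modelled in Rust (the names MutCap, ReadCap, and freeze are invented for this illustration; Rust's own shared/exclusive borrows stand in for the linear capability, so this piggybacks on the borrow checker rather than being truly linear):

  // Hypothetical capability API, invented for this sketch.
  struct MutCap<T>(T);             // exclusive capability: the right to mutate the value
  struct ReadCap<'a, T>(&'a T);    // capability to read references into the value

  impl<T> MutCap<T> {
      fn new(v: T) -> Self { MutCap(v) }
      fn set(&mut self, v: T) { self.0 = v; }
      // Give up mutation to get read access: while the returned ReadCap
      // (or anything derived from it) is alive, set() cannot be called.
      fn freeze(&self) -> ReadCap<'_, T> { ReadCap(&self.0) }
  }

  impl<'a, T> ReadCap<'a, T> {
      fn get(&self) -> &'a T { self.0 }
  }

  fn main() {
      let mut cap = MutCap::new(vec![1, 2, 3]);
      cap.set(vec![4, 5, 6]);            // mutation is fine while we hold MutCap exclusively

      let read = cap.freeze();           // trade the ability to mutate for read access
      let r: &Vec<i32> = read.get();
      // cap.set(vec![0]);               // rejected: cap is still borrowed through read/r
      println!("{:?}", r);

      cap.set(vec![7, 8, 9]);            // fine again once read and r are no longer used
      println!("{:?}", cap.freeze().get());
  }

In effect this is the shared/exclusive borrow split Rust already enforces; the open question is whether the same exchange can be expressed more directly with linear types and explicit capabilities.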

Defer the generalization of monomorphic variables to the module boundary

Just use parametric polymorphism and defer the generalization of a referenced variable until a module boundary or class definition.

At least, that is what I hope will work for me at some point, when I get interested in writing another interpreter. No idea, really.

Mezzo

Mezzo's been mentioned a couple of times recently, and while I haven't investigated it much yet, it sounds like it's squarely in this problem area.

Dependent products

A good elaboration on this topic is in this reddit comment. Basically, the author thinks that the problem arises because, for example, sum types are dependent products (pairs): the type of the second component (the payload) depends on the value of the first component (the tag). Therefore, as long as a reference to the second component exists, the first component must not be modified if the type of the reference is to remain valid.
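
To make the "dependent pair" reading concrete: at the representation level a sum type is a tag plus a payload whose valid type depends on the current value of the tag, roughly as in the hand-rolled (and deliberately unsafe) Rust sketch below; a reference into the payload is only well-typed relative to a fixed tag, so overwriting the tag while such a reference is held silently changes what the reference means.

  use std::mem::ManuallyDrop;

  // Hand-rolled representation of (Either Int String):
  // a tag plus a payload whose meaning depends on the tag.
  union Payload {
      left: i32,
      right: ManuallyDrop<String>,
  }

  struct EitherRepr {
      is_left: bool,
      payload: Payload,
  }

  // The returned reference is only valid as long as the tag stays true.
  fn as_left(e: &EitherRepr) -> Option<&i32> {
      if e.is_left {
          Some(unsafe { &e.payload.left })
      } else {
          None
      }
  }

  fn main() {
      let e = EitherRepr { is_left: true, payload: Payload { left: 3 } };
      println!("{:?}", as_left(&e));   // Some(3)
  }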

More than generalisation

Although generalisation is a problem with mutable references (there is a rule of three here: polymorphism, references, and mutability; any two together are fine, but all three together don't work), the case of 'inner' references seems more complicated. I would probably look to do something with scoping, so that when pattern matching introduces a new nested scope that holds a reference to an inner field, the path to that field would be immutable within that scope.
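
For what it's worth, this is roughly how Rust's match already behaves: while the nested scope introduced by the pattern holds a reference into the matched value, the path to that value cannot be mutated, and it becomes mutable again once the scope ends. A small illustrative sketch:

  fn main() {
      let mut x = Some(3);

      match &x {
          Some(n) => {
              // Inside this scope, x is borrowed through the pattern, so the
              // path to the payload is effectively immutable here:
              // x = None;   // uncommenting this line is rejected
              println!("{}", n);
          }
          None => {}
      }

      // Once the scope holding the inner reference ends, mutation is fine again.
      x = None;
      println!("{:?}", x);
  }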