Morrow & First-class Labels

First-class labels for extensible rows. (draft)
Daan Leijen.
Submitted to POPL'05. (PDF, BibTeX)

Abstract: This paper describes a type system for extensible records and variants with first-class labels; labels are polymorphic and can be passed as arguments. This increases the expressiveness of conventional record calculi significantly, and we show how we can encode intersection types, closed-world overloading, type case, label selective calculi, and first-class messages. We formally motivate the need for row equality predicates to express type constraints in the presence of polymorphic labels. This naturally leads to an orthogonal treatment of unrestricted row polymorphism that can be used to express first-class patterns. Based on the theory of qualified types, we present an effective type inference algorithm and efficient compilation method. The type inference algorithm, including the discussed extensions, is fully implemented in the experimental Morrow compiler.

Always trust Daan to come up with something both elegant and practical...! However, the examples involving bottom (undefined) labels left me skeptical.

No comments?

I'm disappointed. I thought it was a pretty interesting paper. And I know Daan is checking this page regularly, eager to answer your questions...

OK

I'll bite, although I'm probably one of the least qualified people here either to pose a meaningful question or to understand the answer to one.

In Python, the label of an object's attribute (or method, as the case may be - methods are attributes whose values happen to be functions) is simply a string: the attribute's name (the key with which we can retrieve it from the object's dictionary). A Python object is somewhat like a record, in other words, and much of Python's "dynamic" quality derives from this fact.

Python programs often behave as if attributes of different objects with the same name were equivalent in some meaningful way (e.g. as if those two objects were of the same class, or related classes, or implemented a common interface). This means that the latent typing of a Python program is managed by a fundamentally brittle intermediary: when we ask for foo.bar(), we ask for it by name, and any attribute of any class of object that answers to the same name will do.

If we have first-class labels, and by extension first-class messages, then can we make stronger assertions about the equivalence of two attributes with the "same" label, and check those assertions statically? That sounds like a win to me (desirable feature of "dynamic" languages implemented in statically-type-checked context). Is that at all related to what's going on here?

It goes without saying

that I haven't yet read past the first few pages of the paper, and only really skimmed those. Like the others said, it takes time (amongst other things) to do justice to this kind of thing. So if my question is way off the mark, the obvious explanation is probably the correct one.

Labels and Python

If we have first-class labels, and by extension first-class messages, then can we make stronger assertions about the equivalence of two attributes with the "same" label, and check those assertions statically?
Yes, we can! To be precise, the type inferencer will check that the type of the selected value matches the use of that value. We can see that in the type of label selection:
(_._) :: (r\l) => Label l -> {l :: a | r} -> a
i.e. when a label l is selected from a record containing a field l with a value of type a, we get back a value of type a. In contrast, the "type" of attribute selection in Python could be given as:
(_._) :: (r has l) => Label l -> r -> a
That is, for any record (or object) r that contains label (or attribute) l, we get back a value of some (dynamic) type. The a is solely determined by the use site, and it is only checked at runtime whether it matches the actual type of the value in the object.

The above is already possible with normal records, but with first-class labels we can also write code that selects some value of a certain type from a record -- all statically checked. For example, here is a function that selects the value of some label with an Int -> Int type:

intToInt :: (r\l) => {l :: Int -> Int | r} -> Label l -> Int
intToInt r l
  = 12 + r.l 42
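
For readers more at home in Haskell than Morrow, a very rough analogue of this constraint style can be hand-rolled with a multi-parameter type class. This is only an illustrative sketch, not Morrow's encoding: the Has class, the label type L_f, and the record R below are invented, and the approach is far less general than built-in rows.

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- 'Has l r a' plays the role of the has-label constraint: record type r
-- has a field labelled l whose value has type a.
class Has l r a | l r -> a where
  select :: l -> r -> a        -- roughly (_._) :: Label l -> {l :: a | r} -> a

data L_f = L_f                 -- the label "f", reified as a value we can pass around
data R   = R { f :: Int -> Int }

instance Has L_f R (Int -> Int) where
  select _ = f

-- roughly intToInt :: (r\l) => {l :: Int -> Int | r} -> Label l -> Int
intToInt' :: Has l r (Int -> Int) => r -> l -> Int
intToInt' r l = 12 + select l r 42

example :: Int
example = intToInt' (R (+ 1)) L_f   -- 12 + 43 == 55
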
I hope this helps,
-- Daan.

Thanks!

It appears that sometimes it is possible to say something pertinent by accident. Now I have to read the rest of the paper...

We had a conversation here a while ago with the heading "The case for first class messages" that in no time at all veered rather sharply off-topic. Perhaps the topic is worth revisiting?

It takes time...

It looks interesting, but I didn't have the time this week to do any real thinking...

In my experience, the more in-depth and interesting a paper is, the longer it takes for people to respond. Makes sense, doesn't it?

Second

The paper's fascinating. I'm with Ehud, though: it's sufficiently dense and chewy that I'm still masticating, so digestion is going to have to wait a bit.

OK, one immediate thought: is there a relationship to Type Theoretical Foundations for Data Structures, Classes, and Objects?

Thoughts about that..

Hi Paul, I have not read this thesis yet, but browsing through it, it seems that the focus of Kopylov is a theoretical encoding of records for type theorists. I try to give a practical type inference system for records with first-class labels.

However, there may be a big difference between his system and the one I describe. In my system, the extension of a record r with a label l is only valid if the label does not yet occur in the record. This is captured in the "lacks" constraint (r\l).

{ l = _ | _ }  :: r\l => a -> {r} -> {l :: a | r}
In contrast, Kopylov chooses to make the "extend" operation always valid: that is, if a record already contains a label l, it is overwritten. It seems to me that this precludes most general types. Here is an example:
bar :: {l :: a} -> a        
bar r  = r.l

foo r  = bar {l = 1 | r}
What is the type of foo? It can have two types, and neither is more general than the other:
foo :: {} -> Int
foo :: {l :: a} -> Int
Since we have this problem even when we take the intersection of the value types, I doubt that Kopylov's system is suitable for type inference in practice.

btw. This example comes from Mitchell Wand in a corrigendum to an earlier article about type inference for records where he also used liberal record extension:

[Wan87] Wand, M. Complete type inference for simple objects. In Proceedings, Symposium on Logic in Computer Science, IEEE, 1987, pp. 37-44. Corrigendum in Proceedings, Symposium on Logic in Computer Science, p. 132, IEEE, 1988.

Subtyping

This is not a problem if {l :: a} is a subtype of {}.

The tarpit of subtyping :-)

If we assume subtypes, we already have trouble with type inference. For example, in standard type inference, the identity function has the following principal type:

id :: forall a. a -> a
id x = x

Now assume we have subtyping and that Int <: Float. In that case, we can give the type "Int -> Float" to the identity function which is not an instance of "forall a. a -> a". Doing type inference in this case entails adding special subtype constraints and is pretty complex (here is a nice introduction).

I personally feel that subtyping should not be implicit. In Morrow, subtyping is always explicit in polymorphic extension. For example, the type signature:

f :: {l :: a} -> a

means that "f" takes a record with a single field "l" as its argument. In contrast, the function "g":

g :: { l :: a | r } -> a

can take any record that contains a field "l" as its argument. In a language with subtyping, the signature for "f" implicitly equals the one for "g" -- while I believe that there is a real difference. (but then, I am not really an expert on subtyping, so you should take my opinion with a grain of salt :-)
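
As a rough comparison in Haskell rather than Morrow (the class HasL and all names below are invented for illustration), the closed signature corresponds to a concrete record type, while the open one corresponds to a constraint that any record providing an "l" field can satisfy:

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- The closed case: exactly one field, nothing else is accepted.
data OnlyL a = OnlyL { l :: a }

f :: OnlyL a -> a              -- roughly f :: {l :: a} -> a
f = l

-- The open case: any record type that provides an 'l' field.
class HasL r a | r -> a where
  getL :: r -> a

g :: HasL r a => r -> a        -- roughly g :: { l :: a | r } -> a
g = getL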

Reuse

Yes, there is a difference between f and g. But why would you limit a function to only {l :: a} when it works perfectly on every record of the form { l :: a | r }? It only limits the reuse of your function for no good reason.

Final classes

But why would you limit a function to only {l :: a} when it works perfectly on every record of the form { l :: a | r }?
I see this as similar to final classes in Java - the author of a final class specifically precludes others from extending the class. This impacts reuse, but helps to establish a closed world model on some small scale.

One of the differences between finals and closed record types is who decides to close the type - with finals it is the author of the class, with records it's the author of the function. This again looks like a mirror of the classical dichotomy - in the OOP approach the set of operations tends to be conservative and the set of types (classes) more open, while in FP it's the other way around (I mean "orthodox" OOP and FP; of course there are many mechanisms to bridge the gap).

Given type

In that case, we can give the type "Int -> Float" to the identity function which is not an instance of "forall a. a -> a".
I am not sure I understand what you mean by "give". The function already has an explicit type annotation. Do you mean "we could write Int -> Float instead of forall a. a -> a"?

(but then, I am not really an expert on subtyping, so you should take my opinion with a grain of salt :-)

And I am not an expert on typing, so you should not take my opinion at all :) Seriously, I do not see implicit record subtyping as suspicious, though I agree that implicit Int <: Float can cause problems in some cases. (I also find Polymorphic Subtyping in O’Haskell to be a good source).

PS: nice pictures and snowboard guide :)

Is {l :: a} a closed type?

Daan Leijen: In contrast, Kopylov chooses to make the "extend" operation always valid: that is, if a record already contains a label l, it is overwritten.
Technically, he does not overwrite values, but intersects them:

{x : A; x : B} = {x : A ∩ B}

Daan Leijen: I doubt that Kopylov's system is suitable for type inference in practice.
Yes, some of the properties look impractical:
Alexei Pavlovich Kopylov: Thus, we are allowed to have recursive functions as long as we can prove that they terminate on any input from their domain. Of course that makes type-checking undecidable.
I am not quite convinced this invalidates the idea of intersection types.

Daan Leijen: What is the type of foo? It can have two types, and neither is more general than the other: foo :: {} -> Int, foo :: {l :: a} -> Int
In which type system? Judging by syntax, it's not Kopylov's. Assuming we are still discussing shortcomings of his system and I interpret the syntax correctly, I have two observations:
  1. The type of this function is probably neither of these. Informally, if the parameter has no label "l", then the result is of type Int; otherwise it's of type Int intersected with the type of the field with label "l". That is,
    foo : {} -> Int
    foo : {l : A} -> A ∩ Int
    
    and if we consider {} equivalent to {l : EmptyType}, and Int equivalent to EmptyType ∩ Int, then
    foo : {l : A} -> A ∩ Int
    
  2. I believe you use {l :: a} to denote a closed type of records with exactly one label "l", which (I think) is not directly expressible in the system in question. Therefore foo just cannot have such a type.
I might have misinterpreted your comments or Kopylov's thesis, and will be glad to hear more.

Misinterpretation?

Andris
I might have misinterpreted your comments or Kopylov's thesis, and will be glad to hear more.
As I said, I just quickly browsed Kopylov's thesis and I think that I misinterpreted the way we can assign record types. I am actually fascinated by the principal type of "foo" (under your interpretation of his thesis)
foo : {l : A} -> A ∩ Int
This assumes though that:
{} equivalent to {l : EmptyType}, and Int equivalent to EmptyType ∩ Int,

My intuition says that this is a rather bold design decision: it basically means that every record value contains every possible label, where the value is of type EmptyType. Wow!

Furthermore, I still believe that allowing record "extension" even when a label is already present is just not the `right thing', even when one assigns an intersection type instead of overwriting; but I can't make this intuition precise (yet). Hmm, from a practical viewpoint, I think it will lead to dynamic checks/labels at runtime and precludes constant time selection.

Thanks though for pointing this out, as it provides a new incentive for me to (try to) read Kopylov's thesis in depth. I also wonder how this relates to Reynolds' Forsythe language.

absent : EmptyType

My intuition says that this is a rather bold design decision: it basically means that every record value contains every possible label, where the value is of type EmptyType. Wow!

If I remember correctly, at least in one of his papers Mitchell Wand represents records as total functions from labels to (values ∪ "absent").


ON EDIT: To complicate the matter, I believe that depending on definitions of A and Int, A ∩ Int can be EmptyType itself...

<myInterpretation>I interpret (the independently typed part of) the thesis as defining a lattice of types, with subtyping as the partial order relation, intersection as infimum, union as supremum, and Top and Void being the greatest and least types, so I should have called EmptyType Void, my mistake. While record types, by the definition of their subtyping, always have a non-Void intersection (having the union of their labels), the same cannot be said about all types.</myInterpretation>

Labels as functions

Can labels be used as functions?

E.g. if we have
map :: (a -> b) -> [a] -> [b]
can we then do
map l [{l = 1}, {l = 2}, {l = 3}]
and get
[1, 2, 3]

Labels are not functions

Hi Sjoerd,
You cannot use labels directly as functions -- but you can of course use the label selection function:
map (\r -> r.l) [{l = 1}, {l = 2}, {l = 3}] == [1,2,3]
With Haskell section syntax, we can even make this more concise:
map (.l) [{l = 1}, {l = 2}, {l = 3}]
Note that one now has the ability to pass labels as arguments and use them in many different ways: selection, extension, or update. If labels were treated as label selection functions, this would not be possible.
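
For contrast, plain Haskell records take the opposite route: the field name itself is the selection function, so it can be mapped directly, but it cannot also be passed around and reused for extension or update. A tiny sketch (the record type R here is made up):

data R = R { l :: Int }

selected :: [Int]
selected = map l [R 1, R 2, R 3]   -- == [1, 2, 3]; 'l' is just a function R -> Int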

All the best,
   Daan.

Not possible?

In my experimental language Moiell functions and labels share the same syntax (and the same implementation interface). f(x) and x.f are valid syntax for both labels and functions, also for assignment.

There are a lot of differences between Moiell and Morrow, but are you sure it's really not possible?

It is a choice

"Not possible" is probably too strong. What I mean to say is that you always need some way to distinguish between different label operations: selection, extension, update etc.

In a language without first-class labels, one can equate a label l to a label selection function, as in l r, and use special syntax for other operations, as in { l = 2 } and r{ l := 2}. Is this what you do in Moiell?

However, in a language with first-class labels, you also need to be able to distinguish between labels as such and label operations; i.e. do I pass the label l, or do I pass a function that selects label l? Equating labels with label selection is thus no longer possible in such a language... hmmm, unless one uses special syntax to denote labels as such :-)

The label supports "the call operation"

In Moiell labels are just labels. But they are simply allowed where functions are also allowed, and there they operate as the label selection function.

It's like implementing __call__ in Python. Any object in Python that implements __call__ is a function.

In fact in Moiell there are no real functions. There are labels and closures, which both support the call operation. Even numbers and strings implement the call operation: 1(x) is a function call to the "function" 1, which just returns 1. So all constants are also constant functions.

Intersection types or intersection values?

The part about intBool made me jump a little. A value containing both a Bool and an Int (and not, C-like, a Bool being an Int).

Do you infer a type for this:

f x = if x then x + 1 else 0

Do you allow it?

Some time ago, I was looking at intersection values from a usability point of view.
But I didn't get very far with it, concentrating (and failing) on other points.
http://merd.net/types.html#free_intersection,
http://merd.net/polymorphism.html

Intersection types are encoded

Morrow does not infer types for:

f x = if x then x + 1 else 0

Instead, the example is rejected with:

(1,28): Unable to unify types
 context       : 0
 term          : 0
 inferred type : Int
 does not match: Bool

Although Morrow can encode intersection types, it does not use them by default.

This is a bit like overloading in Haskell: an argument passed to a function is not suddenly overloaded, i.e.:

f (+)  = (1 + 2, 1.0 + 2.0)

is surely rejected in Haskell.
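
For what it's worth, the Haskell example only goes through if the argument is explicitly given a polymorphic (rank-2) type, which mirrors the point that the "overloading" has to be requested explicitly. A sketch (assuming GHC's RankNTypes extension):

{-# LANGUAGE RankNTypes #-}

f :: (forall a. Num a => a -> a -> a) -> (Int, Double)
f plus = (1 `plus` 2, 1.0 `plus` 2.0)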

Your example could work by explicitly selecting some field from a record (= the intersection type):

f x  = if x.@bool then x.@int + 1 else 0

And we could also select from any record that has a field of type Int and a field of type Bool:

f :: forall l m r. {l :: Int, m :: Bool | r} -> Int
f x   = if x.any then x.any+1 else 0

I think that it is possible in general to encode a language with built-in intersection types in Morrow by implicitly selecting an "any" field from every binder, and by lifting every defined value into a record type.
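
A rough Haskell rendering of that encoding idea (not Morrow; the record and field names below are invented): a value of an intersection type like Int ∧ Bool becomes an ordinary two-field record, and each use site selects the view it needs:

data IntAndBool = IntAndBool { asInt :: Int, asBool :: Bool }

h :: IntAndBool -> Int
h x = if asBool x then asInt x + 1 else 0

example :: Int
example = h (IntAndBool { asInt = 3, asBool = True })   -- == 4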

Very cool!

This is really clever, and very cool. (Thanks, Frank, for both posting it and nudging us to take a look...)

In the paper (section 5), he refers to "case" and "recElim" as "normal functions," and gives types for them that make perfect sense. Can these functions be defined using the primitive operations on records and variants? I can't figure out how to do it, but the cleverness of the "any" trick, among other things, makes me think I'm probably just missing something...

Thanks to Daan, too, for following this discussion.

Is "case" primitive?

Thanks Matt for being so positive. I am glad that I discovered this site, as there are many interesting discussions going on.

Regarding the implementation of recElim and friends, I am afraid that one can't implement them using the normal (categorical) operations on records. However, we can implement some categorical operations in terms of the record/variant elimination/introduction rules. For example, field selection (*):

(.) :: r\l => {l :: a | r} -> Label l -> a
r.l  = recElim r <l = \x -> x>

Unfortunately, this is the only one I have been able to define up till now. If we extended some operations, we could define more. For example, the "case" (or "varElim") could be extended with an extra argument serving as a default clause. However, I am not yet sure what the proper type of such a function would be -- it seems we need a row concatenation predicate to make this work, i.e. something like:

case' :: ((s ++ t) ~ r) => <r> -> {to s a} -> a -> a
case' variant pattern default = ...

Ah, there is still a lot of design to do here.


(*) This assumes a slightly more general definition of variant construction than described in the paper. The construction now delivers an open variant:

<_ = _> :: r\l => Label l -> a -> <l :: a | r>

Broken link. Here's a live