Objects to Unify Type Classes and GADTs

Objects to Unify Type Classes and GADTs, by Bruno C. d. S. Oliveira and Martin Sulzmann:

We propose an Haskell-like language with the goal of unifying type classes and generalized algebraic datatypes (GADTs) into a single class construct. We treat classes as first-class types and we use objects (instead of type class instances and data constructors) to define the values of those classes. We recover the ability to define functions by pattern matching by using sealed classes. The resulting language is simple and intuitive and it can be used to define, with similar convenience, the same programs that we would define in Haskell. Furthermore, unlike Haskell, dictionaries (or
objects) can be explicitly (as well as implicitly) passed to functions and we can program in a simple object-oriented style directly.

A very interesting paper on generalizing and unifying type classes and GADTs. Classes are now first-class values, resulting in a language that resembles a traditional, albeit more elegant, object-oriented language, and which supports a form of first-class "lightweight modules".

The language supports the traditional use of implicit type class dispatch, but also supports explicit dispatch, unlike Haskell. The authors suggest this could eliminate much of the duplication in the Haskell standard library of functions that take a type class or an explicit function, eg. insert/insertBy and sort/sortBy, although some syntactic sugar is needed to make this more concise.

Classes are open to extension by default, although a class can also be explicitly specified as "sealed", in which case extension is forbidden and you can pattern match on the cases. Furthermore, GADTs can now also carry methods, thus introducing dispatch to algebraic types. This fusion does not itself solve the expression problem, though it does ease the burden through the first-class support of both types of extension. You can see the Scala influences here.

I found this paper via the Haskell sub-reddit, which also links to a set of slides. The authors acknowledge Scala as an influence, and as future work, suggest extensions like type families and to support more module-like features, such as nesting and opaque types.

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

As another point to ponder,

As another point to ponder, I wonder whether the specification of methods could be separated from the object declaration into a "flat" pattern matching function, ie. no nested patterns, just a constructor match. This wouldn't have many of the thorny exhaustiveness checking issues, and would keep the algebraic types declarations more succinct, and resembles a more functional style.

Link broken

Your link to the paper is mis-pasted, but the fixed link seems to be down at the moment anyway.

Edit: It's back up now.

Fixed, thanks. The link

Fixed, thanks. The site works for me, so maybe it's back up. Here's another link to the paper, but it looks it's one page shorter, and the date is a little earlier.

Dependent records as type clases?

How does this work compare with Sozeau and Oury's work [paper] [slides] on modeling type classes with dependent records in Coq? (AFAICT, this has never been discussed on LtU.)

IIRC, it turns out that Coq's type classes could not be extracted to Haskell type classes because the latter were not general enough. Does this work resolve that particular shortcoming?

Dependent records --> Haskell Type Classes

Presumably not, since one can have dependent records that contain types, and this proposal doesn't include anything to do with associated types (it's listed at the end under future work).

The objects proposal can (if I'm reading correctly) be desugared into Haskell functions and datatypes (you probably need GADTs), and you can't really translate everything in a dependent record ala Coq directly into Haskell.

Position information

So far I'm really liking the ideas in this paper. I will have to give it some more thought and translate some existing programs to this new system.

I have one question, regarding section 3.6 "Sealed Classes with Methods":

Location information is added uniformly to all Exp constructors through a class method loc. Computation of loc is always in terms of the locations of a node (object)'s substructures, except for Lit, which has no substructures. Lit.loc always evaluates to Nothing. Doesn't this mean that exp.loc always evaluates to Nothing, for all expressions exp?

Presumably this can be fixed by making the location information an argument to object Lit, but then the claim that code that does not need the extra location information does not need to be changed is no longer true: all pattern matches on Lit (such as the one in eval) will need to match an extra argument.

Am I missing something or is the stated claim false?

Good point. This proposal

Good point. This proposal does not seem to offer sufficient encapsulation in order to satisfy this requirement. Solving this would seem to require an existential accessible only by the method, and the number of parameters when deconstructing would need to be strictly less than or equal to the number of parameters to the constructor.

Thus, you would only need to modify every point where a Lit is constructed, rather than deconstructed, which seems to be as good as it gets. I don't think this ability is available in functional languages, so I'm curious if there are any better solutions without appealing to extensions.

Anonymous Objects & Object Update

Glad your are enjoying the paper Martijn.

You are right to complain about this example. We have been lazy in fixing a couple of things in the draft, including to clarify how this example should work. Indeed if you use something like:

t1 = Div (Plus (Lit 3) (Lit 4)) (Lit 0)

then "Nothing" will always be the returned location. What you really want to do here is to add location information at construction time.

To do so, you can use anonymous objects:

t2 = Div (Plus (Lit 3 {loc = Just (1,1)}) (Lit 4 {loc = Just (1,5)})) (Lit 0 {loc = Just (2,1)})

t3 = Div (Plus (Lit 3) (Lit 4)) (Lit 0) {loc = Just (1,2)}

Anonymous objects allow initialization like anonymous classes in Java, but the actual mechanism that allows this is more general in the sense that updating objects is also possible. If you want to put things in terms of Haskell lingo, the mechanism above is providing similar functionality to record updates.

The code in the paper does not change. So our claims are valid.

Like the rest of the proposed features, anonymous objects are nothing but syntactic sugar. So, you can run the example in Haskell today (but not with the clarity nor the advantages of our approach :)). Here is the code:

module Objects where

type Loc = Maybe (Int,Int)

data Exp a where
Lit :: Loc -> Int -> Exp Int
Plus :: Loc -> Exp Int -> Exp Int -> Exp Int
IsZ :: Loc -> Exp Int -> Exp Bool
If :: Loc -> Exp Bool -> Exp a -> Exp a -> Exp a
Div :: Loc -> Exp Int -> Exp Int -> Exp Int

-- loc method accessor

loc :: Exp a -> Loc
loc (Lit l _) = l
loc (Plus l _ _) = l
loc (IsZ l _) = l
loc (If l _ _ _) = l
loc (Div l _ _) = l

-- constructors

lit :: Int -> Exp Int
lit x = Lit Nothing x

plus :: Exp Int -> Exp Int -> Exp Int
plus e1 e2 = Plus (loc e1) e1 e2

isZ :: Exp Int -> Exp Bool
isZ e = IsZ (loc e) e

iff :: Exp Bool -> Exp a -> Exp a -> Exp a
iff e1 e2 e3 = If (loc e1) e1 e2 e3

divv :: Exp Int -> Exp Int -> Exp Int
divv e1 e2 = Div (loc e1) e1 e2

-- evaluation

eval :: Exp a -> a
eval (Lit _ x) = x
eval (Plus _ e1 e2) = eval e1 + eval e2
eval (IsZ _ e) = eval e == 0
eval (If _ p e1 e2) = if eval p then eval e1 else eval e2
eval d@(Div _ e1 e2) =
if eval e2 /= 0 then eval e1 `div` eval e2
else error (show (loc d) ++ ": division by 0")

-- building some expressions

-- location is always Nothing here
t1 = divv (plus (lit 3) (lit 4)) (lit 0)

-- however, you can add the location info:
t2 = divv (plus (Lit (Just (1,1)) 3) (Lit (Just (1,5)) 4)) (Lit (Just (2,1)) 0)

t3 = Div (Just (1,2)) (plus (lit 3) (lit 4)) (lit 0)

Adding anonymous objects to the core language is fairly trivial. I've been postponing this for quite a while now, but I hope that I can restart work on this topic in the next few months.

class Set (S a)?

Looking at the sheets I wonder whether the author missed a point?

I thought it is legal to specify type classes on types of any kind in Haskell, so it would be entirely legal to specify a class on Set (S a) where S can be instantiated with any type constructor/container of elements of a, like List?

Page 5 of the slides

Hi Marco,

Not entirely sure if I understand your point, but what we mean by "classes as types" is something like what is on page 5 of the slides or Figure 1 in the paper. So type classes like "Eq" cannot be used on positions where types should occurs; and types cannot occur in positions where type classes should occur.

For another example, consider this:

p1 :: [a] => Eq a -> Int

This is not a valid Haskell type, right?

Does this help to clarify our point?

I admit I didn't totally get that

.. but I was reading the sheets and I thought a 'container' class such as Set can be specified as, pardon me Haskell, it's been a while,

class Set (S a) where
insert: a -> S a -> S a
union: S a -> S a -> S a

I.e. you can specify type classes over parameterized types, which makes some of the claims moot, although no doubt it's still an excellent paper.

Let me put it another way

I think what you mean is:

class Set s where
insert :: a -> s a -> s a
union :: s a -> s a -> s a

Indeed this is perfectly valid Haskell code. Let's call this solution A.

One of the things that are going on about in the paper (among others) is that the following is not valid Haskell code:

class Set a where
insert :: a -> Set a
union :: Set a -> Set a

Let's call this solution B.

In section 2.3 of the paper we discuss both solutions, so we do acknowledge that it is possible to work with container types based on Haskell type classes. We call (variants of) solution A bulk types (if you look at Peyton Jones paper, you'll see your proposed solution there). We refer to solution B as ADT.

We wrote the current draft a while ago, and I must say that the terminology used there is wrong. William Cook makes a very nice argument to what the difference between solution A and B are. If you haven't read:

On Understanding Data Abstraction, Revisited

I strongly suggest you to do so. This explains the difference between solution A and B, and settles the terminology in a much nicer way than we have done in the current text. In his terminology solution A is essentially just a (signature of an) ADT, while solution B is just the interface of an object. Those things are quite different beasts, an this essay explains nicely why this is the case.

In any case, I think the confusion that you mentioned regarding 'container' types comes from our related work, where we claim that with some of the Haskell extensions:

"we cannot, in general, define datatypes/container types such as our Graph, Set , Format and Exp examples in those approaches"

Perhaps, less ambiguously, we should have wrote:

"we cannot still have class definitions for object interfaces
like our Graph, Set , Format and Exp examples"

Or, in other words, in those extensions, classes are still not types (just like in current Haskell).

In any case, thanks for the comment, I think the original text may me a bit misleading in that respect.

Right..

Nah, I got it. It was a case of sloppy reading, and being used by my some convention on my own language now. Great paper.

Attribute Grammars

FWIW, class methods seemed closer to smart constructors than attribute grammars.

How about a direct comparison to Scala's analogous features?

One thing I haven't been able to find in relation to this work is what it offers compared to the Scala features that inspired it. I know Scala pretty well but Haskell still makes my head hurt, so I'd appreciate a hand here. :)

Curious subtleties

I noticed the following connundrum while commenting on a draft follow-on paper for Bruno, but the existing paper provides enough context to illustrate the point I stumbled over.

Because constraints are types in Scala, any function of the form

Constraint 'a => 'a -> bool

can be rewritten as an unconstrained function taking an additional argument:

Constraint 'a -> 'a -> bool

Bruno proposes to add a keyword "implicit" to Scala, with the intent that a parameter marked "implicit" is optional. If omitted, the required value will be taken from the unique, global, implicit value (in Scala-speak: object) of that type. This is somewhat like default parameters, but is really intended to provide "fill in the blank" behavior for type classes. It lets him write a single procedure:

sortList : implicit Ord 'a -> ['a] -> ['a]

where the user can explicitly provide an ordering rule or (by omission) use the default ordering rule for the type. So far so good.

But note that Ord 'a is in turn constrained by Eq 'a. So if we start with the original, unsimplified type of sortList:

sort: Eq 'a, Ord 'a => ['a] -> ['a]

and move the Ord constraint to argument position as an implicit parameter:

sort Eq 'a => implicit Ord 'a -> ['a] -> ['a]

we have not automatically discharged the Eq 'a constraint, and we cannot erase it! We know that some definition for Eq 'a was required in order for Ord 'a to be instantiated, but if some variant Ord 'a is provided as an argument we don't know which variant of Eq 'a was used in that definition of Ord 'a. This leads to the potentially awkward outcome that the ord.<= operation might use a different criteria for equality than the == operation that is used within the implementation of sort. Even if it intends/desires to remain consistent, there is no way for the sort implementation to know that it is using the same sense of equality. If we move the Eq 'a constraint:

sort: implicit Eq 'a -> implicit Ord 'a -> ['a] -> ['a]

then the caller is free to provide unrelated definitions of Eq 'a and Ord 'a.

Note further that moving the Eq 'a constraint to the instance/object definition doesn't really help any.

In the Scala scheme, the Ord constraint type class could be written in two ways (in quasi-Haskell, since I don't know Scala):

Eq 'a => class Ord 'a where ...
class Ord 'a (eq :: Eq 'a) where ...

the first says that in order to instantiate Ord 'a we must show some satisfaction for Eq 'a. The second says that we must provide a *specific* satisfaction of Eq 'a at the instance object definition as an argument to the constructor for Ord 'a. That is: Ord 'a has a field that provides a dictionary pointer.

Why a public field? Why not just a field that becomes part of the internal (existential) type? Because there are situations where the user of a particular Ord 'a needs to know that they are using the same definition of equality that was used by that particular instantiation of Ord 'a...

My first thoughts here are (a) this seems to have great power, but also great potential for confusion, but (b) without the ability to require this field at the type class definition, we have no way to state the requirement that each instantiation must expose the instantiation of Eq 'a that was used for that instance of Ord 'a.

And just for completeness, note that inheritance would not simplify matters here.

I just sent a note off to Bruno about this. I suppose my summary comment for the moment is that this stuff is damned subtle, and adding the flexibility of explicitly named dictionaries/constraints seems to impede the compiler's ability to inline fundamental type classes.

It is a know issue

Neither the system proposed in this paper nor Scala implicits (credits due to Martin Odersky, not me :)) do constraint simplification. This also means that we do not lose any late binding ability.

In Scala it doesn't even make sense to talk about constraints, because there is no actual notion of a constraints (only implicit parameters). This is also true to an extent in our system, because we really interpret constraints as implicit parameters. This is related to the semantics of constraints: set semantics for Haskell; tuple semantics for our system.

The issue that you mention has been debated in essentially all the proposals extending or inspired by type classes. Namely "Named instances for haskell type classes", "Making implicit parameters explicit" and our paper. You can find a summary in Section 5, under the paragraph "Named Instances and Explicit Implicit Parameters".

The paper "Named instances for haskell type classes" may be interesting to look at if you'd like to see a system where you can still perform constraint simplification and yet support explicitly passed instances. Of course you cannot get both at the same time :).

Hope this helps!

Thanks...

That certainly is helpful. My concern about the user-perceived complexities of composition with named instances is probably still relevant. The problem there isn't so much a matter of early vs. late binding as it seems to be a matter of inherent ambiguity arising from the presence of multiple possible instance resolutions. I probably need to catch up on the paper you pointed me at before I can talk about this sensibly.

The rest of my comment is mainly driven by trying to understand how to apply these ideas effectively given the differing language design objectives between BitC and Scala.