## Constraint typing, subtyping, and separate compilation

Can someone help me decode this comment from this post on Scala type inference:

On the other extreme, it is *possible* to do constraint typing in a nominal type system with subtyping, but you have to give up entirely on separate compilation. Type checking effectively turns into a special case of control-flow analysis, which not only makes the whole process exponential, but also completely kills the idea of reusable APIs and separately distributable libraries, since the entire program must be available to the type checker in order to definitively derive the union type at a particular declaration site.

Is that a theoretical result published somewhere? And when they say constraint typing, they mean "constraint-based typing" right?

## Comment viewing options

### Hm

I went to look at François Pottier's PhD thesis, which handles structural subtyping. In fact it seems to handle type inference in a modular way, in the sense that inferred constraints can be simplified separately for each type-checked program fragment (provided you have type scheme for the polymorphic identifiers you depend on (module interfaces), as is usual in ML-based languages).

There is a remark about the relation with control-flow analysis in the introduction:

So far, we have discussed typing and type inference only. Yet, it is important to realize that the same algorithms could be described in terms of abstract interpretation. Indeed, since each function application gives birth to a constraint, the constraint graph is in fact an apporximation representation of the program's data flow. For instance, in the absence of polymorphism, our system would be equivalent to a 0-CFA, as pointed out by Palbserg, O'Keefe and Smith [39, 40]. One can also compare our type inference system to a modular abstract analysis, such as SBA [25, 19]. In both cases, the analysis yields a description of the program's behavior, as a set of constraints. Typing is traditionally distinguished from abstract interpretation, because of its greater simplicity and lesser precision; here, the type systems at hand are precise enough that the distinction blurs, and both points of view are possible. Besides, note that this parallel with abstract interpretation allows understanding some of our simplification algorithms in a more graphic way: for instance, garbage collection simply consists in simulating a flow of data through the constraint graph, and in destroying all unused edges.

The discussion between Spiewak and Odersky in the link you cite would seem to indicate that nominal subtyping is harder in this case that structural subtyping. I'm much less familiar with nominal subtyping and don't have references for that.

### Guessing but...

This smells to me like a statement that involves some hidden assumptions. There is a sentence later in that comment (end of the same paragraph) that strikes me as very curious:

...the entire program must be available to the type checker in order to definitively derive the union type at a particular declaration site

This seems to suggest that the union has to be effectively closed (at least de practico). The only reason I can see to require that is that inequality type constraints are being replaced with enumerated constraints. I can see a bunch of reasons from an HM-style inference perspective why that might be a tempting approach, but I really don't see that it's a necessary approach. There has been some clever work on inference in the presence of nominal subtyping lately.

### There has been some clever

There has been some clever work on inference in the presence of nominal subtyping lately.

Cool, any pointers? I'm a bit frustrated now, it doesn't seem that hard for some result much better than Scala's.

### Ceylon

Getting F-Bounded Polymorphism into Shape, 2014
Ben Greenman, Fabian Muehlboeck, and Ross Tate

### Ah, this looks kind of like

Ah, this looks kind of like Jiazzi! Shapes are like classes imported into units. It also matches my experience with using open classes (type families) in Scala: the shape traits where never to be used directly as types, but as bounds for type parameters that were! This design was made illegal in later version of Scala for reasons that I didn't really understand at the time, but now make more sense. I hope Martin Odersky reads this paper.

### Interesting paper

Looks like something useful to understand. This example looks like a pretty clumsy encoding of a type class:

interface Graph<G extends Graph<G,E,V>,
E extends Edge<G,E,V>,
V extends Vertex<<G,E,V>> {
List<V> getVertices();
}
interface Edge<G extends Graph<G,E,V>,
E extends Edge<G,E,V>,
V extends Vertex<G,E,V>> {
G getGraph();
V getSource();
V getTarget();
}
interface Vertex<G extends Graph,
E extends Edge<G,E,V>,
V extends Vertex<G,E,V>> {
G getGraph();
List<E> getIncoming();
List<E> getOutgoing();
}
class City extends Vertex<Map,Road,City> {...}

### I don't think type classes

I don't think type classes can deal with family polymorphism, correct? How do you relate Map, Road, and City so that they are bound to Graph, Edge, and Vertex in the Map, Road, and City types?

See gBeta as a language that deals with such designs natively.

### Can you elaborate on the problem?

I meant something like:

typeclass Graph g v e where
getVerts :: g -> [v]
getEdges :: g -> [e]

getIncoming :: v -> [e]
getOutgoing :: v -> [e]
getGraph :: v -> g

...

instance Graph Map City Road where ...


Note: And I edited the parent comment to fix formatting of less-than.

### type families

Your type class version seems just equal to:

class Graph[G,V,E] { ... }


The point is not in the instantiation of the class, but in the implementation of it. You can define your base graph, and then evolve their implementations in an extension of the graph.

All you are doing here is tying the knot.

### Extension should be orthogonal

You're right that OO languages offer something that Haskell doesn't here (though the type class version doesn't look equivalent to one class Graph[G,V,E]). Namely, you can define an instance for (Map, City, Road) and then refine those classes (BigCity extends City, etc.) while keeping the same implementation of the Graph interfaces. Haskell isn't an OOP language, after all. But I think my point stands that the snippet I posted is a quite cumbersome way of specifying a theory of three related classes.

### How does haskell handle the

How does haskell handle the comparable interface? Can a type class refer to itself?

Edit: ok, this a dumb question. Type classes are just constructors...

### Haskell uses the Eq type

Haskell uses the Eq type class for comparables. Type classes are a mechanism for constraining types to have certain associated functions. So you write (Eq a => ...) to indicate that the type 'a' should have some associated function of type a -> a -> Bool.

So it would look like this:

interface Comparable[c extends Comparable[c]]
Bool operator == (c rhs)

class Foo implements Comparable[Foo]
operator ==(Foo rhs) { ... }


vs

typeclass Eq a
(==) :: a -> a -> Bool

...

instance Eq Foo
(x == y) = ...


Edit: It occurs to me that by comparable, you probably meant the Ord type class, not Eq, but that shouldn't change much...

### Ya, I realized this is super

Ya, I realized this is super easy in haskell, and doesn't require anything like recursion as it does in OOP.

### -

Redundant post, I just made the same point as Matt above.

### Constraints and Unification

Intuitively this may have something to do with constraint propagation. If you have inequality constraints, you cannot tell if unification fails without access to all constraints imposed top down. If you only have access to part of a program you cannot determine if a unification should fail.

If an fragment 1 a disequality is established between A and B, and in fragment 2 A and B are unified, only a single type is presented at the module interface, hence the incompatability of fragments 1 and 2 cannot be expressed at link time.

It doesn't have to be unification, here unification could be replaced by local resolution of an equality constraint. In effect in the presence of equalities and inequalities all constraints need to be resolved at the top level to find any possible contradiction. I think only equality-only constraint resolution is locally resolvable.

### Why can't you just resolve

Why can't you just resolve the constraints as you receive them? Whenever you get a constraint, just update your lattice right there and then.

### Because it doesn't work with type variables.

Because inequality requires the type variables to be grounded in order to discharge it. Equality is the only constraint you can immediately discharge with non ground type variables (by unification). My comment above is that after local unification, how do you express an inequality on something already unified?

I don't get what you mean by updating the lattice?

Edit: I am sure it can be done for some set of restrictions, but I don't know what restrictions. One thing you obviously can't allow is to declare some types in a header that get used in one module but get a subtyping constraint defined in a different module.

### It seems really possible by

It seems really possible by my experience.

I'm not unifying at all, and also I'm not allowing subtyping between type parameters yet, they are only f-bounded (e.g. T <: S is not possible if T and S are type parameters, but T <: Foo[S] is fine). You definitely can't allow subtyping after the fact between two previously visible type variables (don't violate truth in subtyping), but it might be doable for a fresh type variable.

### Other Subtypes

Do you consider Foo[S] <: Foo[T] simplifies to S <: T, and hence its still possible to introduce constraints on ungrounded type variables? Then we find we need to unify S and T, but of course we cannot violate the subtyping constraint when unifying, hence we don't know if unification should succeed until we know the concrete types of S and T.

### I don't allow Foo[S] Also,

I don't allow Foo[S] <: Foo[T] if S and T are both fresh type variables (instances are a different story). So current, S and T are not allowed to have a relationship.

Also, for Foo[TypeA] <: Foo[TypeB], the constituents aren't just related by TypeA <: TypeB or even TypeA = TypeB (well, if you want to be conservative), but something a bit more complex than that!

### Subtyping of Containers

Surely if I have a generic sort procedure that expects an array of numbers (number being the super type of int, float, int64, int32 etc), then:

Array Int <: Array Num


And:

sort :: (a <: Array Num) => a -> a


Should admit any container that has at least the same interface as Array, and contains something that at least has the same interface as Num.

### Yes, you at least get that.

Yes, you at least get that. You can get more than that also, unfortunately.

It might depend on whether your language is mutable or not, but even procedure arguments are contravariant, so I'm not sure.

### What More?

What more can you get (substituting an generic container for an array)? Surely:

a <: Container Num


Means 'a' must be something substitutable where I expect a Container of Num(s). It means exactly that and nothing more. Where's the problem?

f :: (a <: Container Num) => a -> IO ()


Is a function that takes an argument that provides at least the same interface as a container of nums, and we don't mind if it provides extra, as we never use it.

f :: (a <: Container Num) => IO a


Is a function that returns something that is at least a Container Num, IE something that can be used anywhere a container of nums is expected. Again we don't care if it provides extra functionality, because we will never use it.

Note I am using the IO monad because I don't want to deal with the details of what exactly is mutable.

Haskell doesn't have subtyping, so there isn't a problem. Ad hoc polymorphism is quite different from the subtype variety, it isn't about substitutability.

But if you have say b <: a, how are a.T and b.T related? Knowing nothing about variance, it just isn't b.T <: a.T since we need to derive w <: b.T if w <: a.T.

### Substitutability.

I'm not talking about Haskell, I am taking about a Haskell like language (with type classes, parametric polymorphism, similar type system) with the addition of subtyping. Type-classes behave exactly like Java interfaces, except for the static vs runtime difference.

What is wrong with the statement that a function that accepts a container of numbers should accept any subtype of container of any subtype of numbers? Isn't that clearly what we mean by:

a <: Container Number

### Type classes don't behave

Type classes don't behave like Java interfaces, even the designs they lead to are mostly very different.

It is not good enough, it doesn't deal with catcalls. Unless you mandate covariance, which in an immutable languag like haskell might be workable.

### Typeclass = Interface.

A type-class is just a record plus an implicit parameter. A record is an interface just like in Java, for example:

data R a = R {
f :: a -> a
}


interface R {
R f(R r);
}


And you could apply subtyping constraints as required:

data R a = R {
f :: (b <: a) => b -> b
}

data Animal = Animal
data Cat = Cat

-- Cat <: Animal

x = R Animal {
f = id
}

x.f Cat


You pass it an animal, it returns an animal. Where's the difficulty?

### This is an old

This is an old question/debate that I don't feel like is relevant to me right now. The debate is do we need first-class subtype polymorphism, or can it just be effectively simulated with type classes? Just adding a subsumption function doesn't really get you subtypes, though it gives you the subsumption aspect of subtyping (obviously). The problem/expressiveness with subtyping beyond subsumption involves its interaction with type parameters, and that's about it.

### Changing Types

If you have a type system with different semantics that admits the same values, where is the problem. I don't see the need to be attached to the mechanism, its the result that is important?

In any case the debate is settled, in that you can simulate with type classes, it was one of the results from the OOHaskell paper I worked on, linked to below. The syntax in Haskell is not very nice, but you could imagine a language with a syntax that looks like Java, but that uses a type system like Haskell's.

One of the interesting results is that you don't need to worry about co-variaiance or contra-variance unless performing an explicit narrowing operation. This is because the type checker looks at the individual method type signatures. You basically get structural subtyping (duck-subtyping) but you can do it nominally too. Really type-classes are the super type of all subtyping methods, and you could construct many object-oriented language behaviours within the sandbox of type-classes. The paper shows how to construct many OO language features, so that you can quickly build a DSL OO language prototype with sound type-checking.

### Again, this is a different

Again, this is a different more ideological argument to have some other time. There are good reasons Haskell hasn't taken over the world yet even with type classes, and maybe not so good reason also.

I now realise that the comment the original post refers to is talking about a method of type inference, and nothing to do with constraint types. It is referring to the technique of bidirectional inference where a set of constraints is gathered for a whole program and solved hence avoiding the left-to-right bias in something like Scala. I think this is actually a flawed argument though as my compositional typing algorithm solves this problem, at the cost of making the type signatures inferred in the 'interface' files more complex. See:

It is also flawed in a simpler perspective, if we simply require explicit type signatures at module boundaries, which is a good thing so we maintain stable interfaces between modules, then it becomes simply whole-module inference not whole-program inference, so I don't think its true on a much simpler level too.

Everything below continues the previous (misdirected) discussion, but I think its interesting so I am going to leave it here.

Trying to backtrack to where we diverged from the topic, Explicit subtype constrains seem equivalent to implicit subtyping, and clearly express co- and contra-variance. Also we should expect them to obey the relevant algebraic rules.

I think the original subject assumes we are dealing with an algebraic type system, and it is under these circumstances that the introduction of subtyping breaks separate compilation. If you don't allow all the axioms, the conclusion may not apply. However, as soon as you introduce generics, you will find you need all the axioms to allow generics to be used in all the ways people want to use them. This leads to trying to combine a type system with parametric polymorphism and function types with OO style subtyping, and that apparently leads to problems with separate compilation. I'm not so sure now, as with type-classes we don't have such problems, and if they subsume subtypes then subtypes should not have such problems either?

Subtype constraints in an algrbraic type system are the equivalent of 'subtypes' in a 'C++' like type system that lacks function types. I think it is actually a category error to try and combine the two. Subtyping in an algebraic type system _is_ subtype constraints.

It seems much better to use subtype constraints (and type-classes) to achieve the same restrictions on the value level code, and keep the type system simpler and easier to understand, whilst avoiding these problems. I find it really hard to see the difference between:

f :: (a <: Shape) => a -> a
g :: (a -> b <: Square -> Shape) => (a -> b) -> b -> b
h :: Shape -> Shape


and

Shape f(Shape)
Shape g(UnaryFunction<Square, Shape>, Shape)
// cannot express 'h'.


Apart from that I can ask for exactly a Shape in the former which seems more powerful, and you cannot clearly see the variance of generic 'Function' you would have to look at the definition to see which parameters are "in" or "out" in C# for example. These are two different notations for the same concepts, except information is clearly missing in the C++ style signatures. The problems stem from the lack of function types, in an algebraic type system 'a -> b' has clearly different subtyping rules than datatype 'X a b', whereas C++ style generics do not distinguish between 'UnaryFunction<a, b>' and 'Pair<a, b>'.

Thinking about the invariance of arrays, of course an array interface can have the desired variance, considering:

    set :: Array x -> Int -> x -> IO ()
get :: Array x -> Int -> IO x


Would have the correct variance on the element written or read form the array. It is the implicit array accessor a[i] that causes problems because the result is a reference that can be both input and output.

### Forget it

&nbsp;

Here's a reference to a paper I contributed to that covers the topic in depth.

http://arxiv.org/pdf/cs/0509027.pdf

### Fascinating

Good of you to edit your comment.

Sorry about that original comment, I am going through a phase of disliking casting at the moment. I plan on only providing type-based dispatch (maybe including a type-case statement) in the language I am working on.

In fact there are two different issues here, which are static subtyping, and runtime (dynamic) subtyping. I was trying to keep only to static subtyping. In dynamic subtyping, it looks like you need a cast, but in something like C++ its preferable to use the visitor pattern rather than casting. This is implemented only using vTable dispatch, and you could use this mechanism too for a type-case. Now there is no casting, and no need for runtime safety checks (as its simple vTable dispatch on two objects.

If we allow open data types and extensible records, and unify type classes and records using implicits (and modules too for good measure), then a single declaration will give us an interface that can be used in static subtype polymorphism (when used implicitly) or dynamic subtype polymorphism (when used explicitly).

### Maybe you misunderstood my argument

I think that I was basically agreeing with you. The initial argument about casting was to show that it's a special case of a general constructive definition of subtyping (but for some reason, many people seem to define subtyping strictly in terms of castability). A general witness to subtyping may require non-trivial work (as in C++ with multiple or virtual inheritance, not talking about visitors or unsound "up-casting" -- that's a different thing entirely).

But I cut out the argument since it seemed like an intolerable tangent to you.

### I probably misunderstood

I think I said you missed my point, but that doesn't mean it wasn't right or useful to someone else. I did feel it was a tangent, but I did not find it intolerable. I thought it was interesting, but I didn't really have the time to compare it to what I have been working on, but some of the parts of the OOHaskell paper on subtyping and variance dealt with something similar. In a way you were introducing (completely valid) details that I was trying to gloss over.

### Hm

Covariance is only sound for read-only arrays. So I suppose this particular sort function would take the covariant ReadOnlyArray generic subclass of the Array generic class.

### I think I've come up with an

I think I've come up with an encoding I think that avoids unification as well as a covariance distinction. We divide the type into multiple channels and "tunnel" parts of the type into latent channels that become active only on explicit assignment.

But that is what I meant: there is something more to a

### Use-case variance

Seems related to use-case variance (which was judged a too complex in the form that was integrated into Java; you can possibly do better with a new design). I guess we will have to wait for you to release you system.

### Not wild cards, really

Not wild cards, really transmitting parts of the type in covert channels. It is quite weird and I wonder if it really works. The math seems to work out but why it works out I have no clue!

I'll release videos and an initial draft of a paper pretty soon. I'm a bit far away from a release (mostly related to the live programming features I'm also working on :) ).

### Mutable arrays are invariant

Mutable arrays are invariant, so (assuming that Array is a mutable type) a <: Array Num is the same as a == Array Num. The correct type for the sort function would then be (since it uses the argument in a read-only way):

 sort :: (a <: Num) => Array a -> Array a 

### Indeed

You are right, and this is not a good example to demonstrate container variance. A bound (a <: Ord[a]) would be even better.

### I Disagree (sort of).

The simple goal, a function that can sort a (mutable) container of numbers, whatever the type of the container (providing it conforms to at least some interface) of elements (providing they conform to at least some interface).

This is no problem... we can be polymorphic on the array contents. We can read an element for the array and write it back. There is no problem with mutable arrays. (warning: I am probably making some different assumptions to you).

The generic function will of course use associated types, so the value read from the array has ValueType(I) where I is the iterator on the array (we can get the type of the iterator from IteratorType(C) where C is the container). Consider:

typeclass Container c where
data IteratorType c :: *
begin :: c -> IteratorType c
end :: c -> IteratorType c

typeclass Eq i => Iterator i where
data ValueType i :: *
succ :: i -> i
pred :: i -> i

typeclass Iterator i => Readable i where
source :: i -> IO (ValueType i)

typeclass Iterator i => Writeable i where
sink :: i -> ValueType i -> IO ()

reverse :: (Container c, Readable (IteratorType c), Writable (IteratorType c)) => c -> IO ()
reverse x = reverse' (begin x) (end x) where

reverse' f l | (f == l || succ f == l) = return ()
reverse' f l | f /= l = do
a <- source f
b <- source l
sink f b
sink l a
reverse' (succ f) (pref l)


Actually I see the point, that the above assumes all elements in the array are the same type. It still should be possible, in that we can use a polymorphic copy/clone operation, and of course in the underlying implementation the array would really be an array of object handles where the actual objects are on the heap. The implementation is going to be pretty complex in a language like Haskell.